On  the  Cover 


■  A  SYNTHETIC-APERTURE  RADAR  (SAR) 
image  of  a  golf  course  in  Stockbridge,  New 
York was collected by the Lincoln Laboratory
Ka-band airborne radar, which
transmits  and  receives  horizontally  and 
vertically  polarized  signals  to  produce  three 
unique  polarimetric  images  (HH,  HV,  and 
VV) of a scene. The polarimetric images are
combined  by  using  the  polarimetric 
whitening  filter,  which  reduces  the  speckle 
inherent  in  high-frequency  SAR  imagery. 
Two-by-two noncoherent averaging of the
data  reduces  the  speckle  further.  The 
combination  of  polarimetric  whitening  and 
noncoherent  averaging  produces  a  SAR 
image  with  near-optical  quality  (see  the 
optical  photograph  below  for  comparison). 
Unlike optical sensors, however, the SAR
can  produce  high-quality  imagery  day  or 
night,  under  all  conditions,  including 
through  dense  clouds  or  smoke. 
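
As a rough illustration of why noncoherent averaging reduces speckle, the following Python sketch simulates single-look speckle and averages it over two-by-two blocks. The exponential intensity model, the non-overlapping block layout, and all function names are assumptions made for this example; they are not taken from the Lincoln Laboratory processing chain, and the polarimetric whitening filter itself is not modeled.

```python
import random
import statistics


def simulate_speckle(rows, cols, mean_intensity=1.0, seed=0):
    """Return a rows x cols image of single-look speckle intensities.

    Assumption: fully developed speckle, so each pixel intensity is an
    independent exponential random variable with the given mean.
    """
    rng = random.Random(seed)
    return [[rng.expovariate(1.0 / mean_intensity) for _ in range(cols)]
            for _ in range(rows)]


def noncoherent_average_2x2(image):
    """Average intensities over non-overlapping 2x2 blocks."""
    rows = len(image) // 2 * 2
    cols = len(image[0]) // 2 * 2
    return [[(image[r][c] + image[r][c + 1]
              + image[r + 1][c] + image[r + 1][c + 1]) / 4.0
             for c in range(0, cols, 2)]
            for r in range(0, rows, 2)]


def speckle_ratio(image):
    """Standard-deviation-to-mean ratio, a common measure of speckle."""
    pixels = [p for row in image for p in row]
    return statistics.pstdev(pixels) / (sum(pixels) / len(pixels))


if __name__ == "__main__":
    raw = simulate_speckle(200, 200)
    averaged = noncoherent_average_2x2(raw)
    print(f"single-look ratio:  {speckle_ratio(raw):.2f}")
    print(f"2x2 averaged ratio: {speckle_ratio(averaged):.2f}")
```

Under these assumptions the standard-deviation-to-mean ratio falls from about 1 to about 0.5, at the cost of halving the image resolution in each dimension.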

The  SAR  data  were  collected  from  a 
range  of  7  km  at  a  depression  angle  of  20°. 
Range  in  the  SAR  image  increases  from  top 
to  bottom;  therefore,  the  radar  shadows  of 
trees  and  other  objects  extend  toward  the 
bottom  of  the  image.  Some  features  visible 
in  this  image  are  a  street,  a  parking  lot,  the 
golf-course  club  house,  a  putting  green 
adjacent  to  a  water  hazard,  and  a  fairway  to 
the  right  of  the  water  hazard. 

A  large  database  of  tactical  target 
imagery  and  Stockbridge  clutter  imagery 
has  been  used  to  develop  and  test  automatic 
target  recognition  algorithms.  The  article 
entitled  “Performance  of  a  High-Resolution 
Polarimetric  SAR  Automatic  Target 
Recognition  System,”  by  Leslie  M.  Novak, 
Gregory  J.  Owirka,  and  Christine  M. 
Netishen,  presents  an  overview  of  an  end- 
to-end  automatic  target  recognition  system 
and  a  summary  of  the  performance  results 
of  the  recognition  algorithms. 


EDITORIAL  BOARD 

Roger W. Sudbury, Chair
Peter  K.  Blankenship 
Charles  F.  Bruce 
Vincent  W.S.  Chan 
John  C.  Fielding 
Herbert Kleiman
John  A.  McCook 
Ivars Melngailis
Antonio  F.  Pensa 
John  A.  Tabaczynski 
Lee  O.  Upton 

EDITORIAL  STAFF 

Jack Nolan, Editor-in-Chief
Dan E. Dudgeon, Guest Editor
Richard T. Lacoss, Guest Editor
Alden M. Hayashi, Technical Editor
Randall Warniers, Technical Editor
Leslie Satford Spiro, Copy Editor
Patricia L. MacDonald, Subscription Coordinator

PRODUCTION  STAFF 

Claude A. French, Jr., Publications Manager
Patricia Kennedy Graham, Assistant Publications Manager
Jonathan C. Barron, Photographer
Mary K. Bourquin, Document Coordinator
Kathleen L. Coy, Lead Technical Artist
Richard B. Doubleday, Technical Artist
Paula M. Gentile, Technical Artist
Allene T. Shimomura, Proofreader

THE LINCOLN LABORATORY JOURNAL (ISSN 0896-4130) is published
by Lincoln Laboratory, Massachusetts Institute of Technology,
244 Wood St., Lexington, MA 02173-9108. Subscriptions are free
of charge, but provided only to qualified recipients (government
employees and contractors, libraries, university faculty, and R&D
laboratories). Requests for individual copies, subscriptions, or
permission to reprint articles should be submitted to the
Subscription Coordinator, Room A-222, at the above address.

Phone (617) 981-7026, or journal@ll.mit.edu on Internet e-mail.

The work reported in this journal was performed at Lincoln
Laboratory, a center for research operated by Massachusetts Institute
of Technology. The views expressed in this journal are those of the
authors and do not reflect the official policy or position of the
United States Government.

Postmaster:  Please  send  address  changes  to  the  above  address. 

©1993 by Lincoln Laboratory, Massachusetts Institute of
Technology.  All  rights  reserved. 




Massachusetts  Institute  of  Technology 


LINCOLN  LABORATORY  Journal  SPRING  1993,  VOLUME  6,  NUMBER  1 


3  An  Overview  of  Automatic  Target  Recognition 

Dan E. Dudgeon and Richard T. Lacoss

In this article we introduce the subject of automatic target recognition (ATR). Interest in ATR is increasing in the
defense community as the need for precision strikes in limited warfare situations becomes an increasingly important
part of our defense posture. We discuss the difficulty of the ATR problem and we survey the variety of approaches that
try to solve the problem. We conclude by introducing the other articles in this issue of the Lincoln Laboratory Journal.

11 Performance of a High-Resolution Polarimetric SAR Automatic Target Recognition System

Leslie M. Novak, Gregory J. Owirka, and Christine M. Netishen

Lincoln  Laboratory  is  investigating  the  detection,  discrimination,  and  classification  of  ground  targets  in  high-resolution, 
fully polarimetric, synthetic-aperture radar (SAR) imagery. This article summarizes our work in SAR ATR by discussing
the prescreener, discrimination, and classification algorithms we have developed. The prescreener required a low
threshold to detect most of the targets in the data, which resulted in a high density of false alarms. The discriminator
and classifier stages then reduced this false-alarm density by a factor of 100. We improved target-detection performance
by using fully polarimetric imagery processed by the polarimetric whitening filter (PWF), rather than by using
single-channel imagery. The PWF-processed imagery improved the probability of correct classification in a four-class classifier.

25  Discriminating  Targets  from  Clutter 

Daniel E. Kreithen, Shawn D. Halversen, and Gregory J. Owirka

The  Lincoln  Laboratory  multistage  target-detection  algorithm  for  SAR  imagery  can  be  separated  into  three  stages:  the 
prescreener,  the  discriminator,  and  the  classifier.  In  this  article,  we  examine  fifteen  features  that  are  used  in  the 
discrimination  algorithm.  The  set  of  best  features  from  this  pool  of  fifteen  was  determined  by  a  theoretical  analysis,  and 
was  then  verified  by  using  real  SAR  data.  Performance  was  evaluated  for  a  number  of  different  cases;  in  all  cases  the 
theoretical  performance  analysis  closely  matched  the  real  data  performance.  In  addition,  we  formulate  a  set  of  criteria 
for  best  feature  choice  that  apply  to  quadratic  discrimination  algorithms  in  general. 


53  Improving  a  Template-Based  Classifier  in  a  SAR  Automatic  Target  Recognition 
System  by  Using  3-D  Target  Information 

Shawn  M.  Verbout,  William  W.  Irving,  and  Amanda  S.  Hanes 

We propose an improved version of a conventional template-matching classifier currently used in an operational ATR
system  for  SAR  imagery.  This  classifier  was  originally  designed  to  maintain  a  library  of  2-D  reference  images  formed  at  a 
variety  of  radar  viewing  directions.  The  classifier  accepts  an  input  image  of  a  target  of  unknown  type,  correlates  this 
image  with  a  reference  template  selected  from  each  target  library,  and  then  classifies  this  image  to  the  target  category 
with  the  highest  correlation  score.  The  algorithm  produces  surprisingly  poor  classification  results  for  some  target  types, 
however,  because  of  differences  in  SAR  geometry  between  the  input  image  and  the  best-matching  reference  image.  We 
correct  this  deficiency  by  incorporating  a  model-based  reference  generation  procedure  into  the  original  classifier. 


77 Neural Systems for Automatic Target Learning and Recognition

Allen M. Waxman, Michael Seibert, Ann Marie Bernardon, and David A. Fay

We  have  designed  and  implemented  several  computational  neural  systems  for  the  automatic  learning  and  recognition  of 
targets  in  both  passive  visible  and  synthetic-aperture  radar  (SAR)  imagery.  Motivated  by  biological  vision  systems  (in 
particular,  that  of  the  macaque  monkey),  our  computational  neural  systems  employ  a  variety  of  neural  networks.  In  this 
article  we  present  an  overview  of  our  research  for  the  past  several  years,  highlighting  our  earlier  work  on  the 
unsupervised  learning  of  three-dimensional  objects  as  applied  to  aircraft  recognition  in  the  passive  visible  domain,  the 
recent  modification  of  this  system  with  application  to  the  learning  and  recognition  of  tactical  targets  from  SAR  imagery, 
the further application of this system to reentry-vehicle recognition from inverse SAR, or ISAR, imagery, and the
incorporation of this recognition system on a mobile robot called the Mobile Adaptive Visual Navigator (MAVIN) at
Lincoln Laboratory.



117 Multidimensional Automatic Target Recognition System Evaluation

Paul J. Kolodzy

We are developing an evaluation facility that includes an electronic terrain board (ETB) to provide an effective test
environment for ATR systems. The input to the ETB is very high-resolution data taken in the modalities of interest
(laser radar, passive IR, and visible). The ETB contains sensor and target models so that measured imagery can be
modified for sensitivity analyses. The evaluation facility also contains a reconfigurable suite of ATR algorithms that can
be interfaced to real and synthetic data for developing and testing ATR modules. This article presents a description of
the infrared airborne radar used to gather sensor data, a discussion of sensor fusion and the hybrid ATR measurement
system, and a review of the ATR evaluation facility. We give results of processing real and synthetic imagery with the
ATR system, with an emphasis on interpreting results with respect to sensor design.

147 An Efficient MRF Image-Restoration Technique Using Deterministic
Scale-Based  Optimization 

Murali  M.  Menon 

A  method  for  performing  piecewise  smooth  restorations  on  images  corrupted  with  high  levels  of  noise  has  been 
developed.  Based  on  a  Markov  Random  Field  (MRF)  model,  the  method  uses  a  neural  network  sigmoid  nonlinearity 
between pixels in the image to produce a restoration with sharp boundaries while providing noise reduction. The model
equations  are  solved  with  the  Gradient  Descent  Gain  Annealing  (GDGA)  method — an  efficient  deterministic  search 
algorithm  that  typically  requires  fewer  than  200  iterations  for  image  restoration  when  implemented  as  a  digital 
computer  simulation.  A  novel  feature  of  the  GDGA  method  is  that  it  automatically  develops  an  annealing  schedule  by 
adaptively  selecting  the  scale  step  size  during  iteration. 

161  Machine  Intelligent  Automatic  Recognition  of  Critical  Mobile  Targets  in  Laser  Radar  Imagery 

Richard L. Delanoy, Jacques G. Verly, and Dan E. Dudgeon

A variety of machine intelligence (MI) techniques have been developed at Lincoln Laboratory to increase the
performance reliability of automatic target recognition (ATR) systems. Useful for recognizing targets that are only
marginally visible (due to sensor limitations or to the intentional concealment of the targets), these MI techniques have
become integral parts of the Experimental Target Recognition System (XTRS), a general-purpose system for
model-based ATR. Using laser radar images collected by an airborne sensor, the prototype system recognized a variety of
semitrailer trucks with high reliability, even though the trucks were deployed in high-clutter environments.

187  Machine  Intelligent  Gust  Front  Detection 

Richard L. Delanoy and Seth W. Troxel

Techniques of low-level machine intelligence, originally developed at Lincoln Laboratory to recognize military ground
vehicles obscured by camouflage and foliage, are being used to detect gust fronts in Doppler weather radar imagery.
This Machine Intelligent Gust Front Algorithm (MIGFA) is part of a suite of hazardous-weather-detection functions
being developed under contract with the Federal Aviation Administration. MIGFA has demonstrated levels of detection
performance that have not only markedly exceeded the capabilities of existing gust front algorithms, but that are
competitive with human interpreters.

213  Extracting  Target  Features  from  Angle-Angle  and  Range-Doppler  Images 

Su May Hsu

For  diffuse  targets,  features  such  as  shape,  size,  and  motion  can  be  determined  from  a  time  series  of  images  from  either 
angle-angle passive telescopes or range-Doppler radars. The extracted target features can then be used for automated
target  recognition  and  identification.  An  algorithm  that  uses  scene-analysis  techniques  has  been  developed  to  perform 
the  feature  extraction. 


An  Overview  of  Automatic 
Target  Recognition 

Dan  E.  Dudgeon  and  Richard  T.  Lacoss 

■ In this article we introduce the subject of automatic target recognition
(ATR). Interest in ATR is increasing in the defense community as the need for
precision strikes in limited warfare situations becomes an increasingly important
part of our defense posture. We discuss reasons for the difficulty of the ATR
problem and we survey the variety of approaches that try to solve the problem.
We conclude by introducing the remaining articles in this special issue of the
Lincoln Laboratory Journal.


Automatic target recognition (ATR) generally refers to the use of computer
processing to detect and recognize target signatures in sensor data. The
sensor data are usually an image from a forward-looking infrared (FLIR)
camera, a synthetic-aperture radar (SAR), a television camera, or a laser
radar, although ATR techniques can be applied to non-imaging sensors as
well. ATR has become increasingly important in modern defense strategy
because it permits precision strikes against certain tactical targets with
reduced risk and increased efficiency, while minimizing collateral damage
to other objects. If computers can be made to detect and recognize targets
automatically, the workload of a pilot can be reduced and the accuracy and
efficiency of the pilot’s weapons can be improved.

ATR technology can also be applied to non-military problems. For example,
the problem of recognizing landmarks seen by a visual navigation system or
a robotic system is related to the ATR problem. The recognition of
particular objects or faces in photographs or video sequences is also
related to ATR. We can think of the ATR problem as one part of the general
problem of machine vision; namely, how can computers be made to do what we
humans do so easily and naturally?

The fundamental problem of ATR is to detect and recognize objects of
interest (targets) in an environment of clutter imaged by an imperfect
sensor that introduces noise into the resulting signal. The definitions of
target, noise, and clutter depend upon the application. Target categories
can be coarse (e.g., a treaded ground vehicle) or fine (e.g., a specific
type of tank or even a specific tank). Often the term classification is
used for coarse categorization and the term identification is used for fine
categorization, although they are also used synonymously with the term
recognition. Unfortunately, usage is not consistent.

Clutter refers to real things that are imaged (buildings, cars, trucks,
grass, trees, and other objects) but are not targets of interest. Sometimes
a distinction is made between naturally occurring clutter (grass, trees,
topographical features) and man-made cultural clutter (buildings, vehicles,
and other works). Clutter tends to dominate the imagery simply because
targets are generally sparse compared to the environment in which they
operate. Noise refers to electronic noise in the sensor as well as
inaccuracies introduced in the computations by a signal processor.
Depending on the ATR application, the problem may be one of extracting a
signal from noise or it may be one of separating a target from its
surrounding clutter.

The distinction between detection and recognition is ill defined. We could
argue that recognition is just the detection of a specific target type. But
algorithms developed from this viewpoint tend to require prohibitive
amounts of processing. For this reason, ATR systems generally include a
front-end detection stage. The goal of the detection stage is to eliminate
most of the sensor data from further consideration without eliminating any
of the targets of interest. In this context, the term detection means that
something interesting has been discovered, and this discovery requires
further analysis. For example, a small cluster of bright pixels in an image
could indicate the presence of an object. Computationally simple detection
algorithms are required at this stage because all the sensor data in the
input image must be examined.

[Figure 1 block diagram: sensor data containing potential targets ->
Detection -> Recognition -> target list; the detection stage discards
background clutter and the recognition stage discards non-target clutter.]

FIGURE 1. Conceptual data flow in automatic target recognition (ATR) systems. Simple detection algorithms
are applied to all the sensor data to isolate small portions that might contain targets. More complex recognition
algorithms then process the selected portions of the data to reject non-target clutter and classify targets.
Ideally, all targets of interest pass through the pipeline and are included in the output target list.

Practical implementations of ATR systems can be viewed as pipeline
processing systems, as illustrated in Figure 1. Ideally, all targets of
interest pass through the pipeline and are included in the output target
list. As data move through the pipeline, the processing algorithms become
more target specific and computationally intensive for each data item,
while the number of data items processed and the number of clutter false
alarms both decrease. Even with this structure, the front-end detection
stages of the processing pipeline often require the most computational
power because the ATR system must search large amounts of imagery to find a
few instances of the target.
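
The pipeline structure can be sketched in a few lines of Python. This is a toy illustration only: the threshold detector, the three-by-three chip size, and the `plus_recognizer` stage are hypothetical stand-ins chosen for this example, not algorithms described in this article.

```python
def detect(image, threshold):
    """Cheap front-end stage: flag every pixel brighter than threshold.

    This stage visits every pixel, so it must stay computationally simple.
    """
    return [(r, c)
            for r, row in enumerate(image)
            for c, value in enumerate(row)
            if value > threshold]


def extract_chip(image, center, half=1):
    """Cut a square chip around center; return None if it leaves the image."""
    r, c = center
    if r < half or c < half:
        return None
    if r + half >= len(image) or c + half >= len(image[0]):
        return None
    return [row[c - half:c + half + 1]
            for row in image[r - half:r + half + 1]]


def run_pipeline(image, threshold, recognizer):
    """Run detection on all pixels, then recognition only on the detections."""
    target_list = []
    for center in detect(image, threshold):
        chip = extract_chip(image, center)
        label = recognizer(chip) if chip is not None else None
        if label is not None:  # None means rejected as non-target clutter
            target_list.append((center, label))
    return target_list


def plus_recognizer(chip):
    """Hypothetical recognition stage: accept only an exact plus shape."""
    plus = [[0, 1, 0], [1, 1, 1], [0, 1, 0]]
    binary = [[1 if v > 0.5 else 0 for v in row] for row in chip]
    return "plus-shaped target" if binary == plus else None


if __name__ == "__main__":
    image = [[0.0] * 7 for _ in range(7)]
    for r, c in [(1, 2), (2, 1), (2, 2), (2, 3), (3, 2), (5, 5)]:
        image[r][c] = 1.0  # a plus-shaped "target" and one clutter pixel
    print(run_pipeline(image, 0.5, plus_recognizer))
```

In this toy scene the detector examines all 49 pixels and passes six detections to the recognizer, which keeps only the one centered on the plus shape.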

For a specific ATR problem, both the target signatures and the clutter
background can vary. In thermal images, for example, a tank can be hotter
or colder than its background (causing positive or negative thermal
contrast). It can also exhibit one shape when viewed from the side and
another shape when viewed from the front; its turret can be rotated to any
position and its gun barrel can take on a range of elevation angles. The
background clutter can be a benign meadow, a treeline, a forest road, a
featureless desert, or an urban area. Such complicated variations in both
target signature and background clutter contribute significantly to the
difficulty of the ATR problem.

ATR  Technology 

Many technologies and techniques are utilized in attempting to solve the
problems of ATR, as illustrated in Figure 2. Sensor technology is critical
because performance is ultimately limited by the quality of the information
provided by the sensors. Processing hardware is also critical because ATR
algorithms must process large quantities of data, often in real time, and
because system development can be significantly hindered by the lack of
processing capacity. Software, simulation, and evaluation methodologies and
tools are also important elements. In addition, important but indirect
contributions are made by neural and cognitive sciences, statistics, and
sensor physics.

Figure 2 also identifies several algorithmic approaches to the problem of
ATR; the multiplicity of these approaches indicates that no satisfactory
single approach has yet been found. The most successful ATR systems will
probably blend several algorithmic techniques to get satisfactory
performance.

Detection  Theory 

ATR is based in part on detection theory and related statistical ideas that
date back to the early days of radar processing. If a target signature is
deterministic or if it is stochastic with well-defined stationary
statistics, and if the clutter is well characterized as a stationary random
process, then optimal detection techniques such as the matched filter can
be derived. In classical detection-theory problems, a trade-off exists
between the probability of detecting a target signature when the target is
there, and the probability of declaring that a target is present when in
fact it is not there (i.e., a false alarm). The objective of optimal
detection processing is to separate the distributions of target signatures
and clutter signatures so that they can be distinguished by a simple
statistical test.
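
The trade-off described above can be made concrete for the simplest textbook case. The sketch below assumes a known (deterministic) signature in white Gaussian noise of known standard deviation, which is an idealization chosen for this example rather than a model from this article; the function names are hypothetical.

```python
import math


def q_function(x):
    """Tail probability of a standard normal variable: P(Z > x)."""
    return 0.5 * math.erfc(x / math.sqrt(2.0))


def matched_filter_statistic(data, signature):
    """Correlate the data with the known signature (matched-filter output)."""
    return sum(d * s for d, s in zip(data, signature))


def operating_point(signature, noise_sigma, threshold):
    """Return (P_d, P_fa) for the test 'statistic > threshold'.

    Assumption: the data are the signature plus white Gaussian noise when
    the target is present, and white Gaussian noise alone when it is absent.
    """
    energy = sum(s * s for s in signature)
    scale = noise_sigma * math.sqrt(energy)  # std. dev. of the statistic
    p_fa = q_function(threshold / scale)
    p_d = q_function((threshold - energy) / scale)
    return p_d, p_fa


if __name__ == "__main__":
    signature = [1.0, 2.0, 1.0]
    for threshold in (0.0, 3.0, 6.0):
        p_d, p_fa = operating_point(signature, 1.0, threshold)
        print(f"threshold={threshold:.1f}  P_d={p_d:.3f}  P_fa={p_fa:.3f}")
```

Raising the threshold lowers the false-alarm probability but also lowers the detection probability, which is the trade-off in its simplest form.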

A seductive aspect of the detection-theory approach is that it provides a
firm theoretical foundation for both the development of algorithms and the
understanding of their performance. This approach to ATR, however, requires
valid and analytically (or computationally) tractable statistical models
for both targets and clutter. Such models are difficult to develop, and
this approach is greatly complicated by signature and clutter variability.
Detection theory is conceptually appealing, but it has had only limited
success for ATR problems.

FIGURE 2. ATR technologies and processing methods. The circles identify some of the technologies and methods that are
required to solve difficult ATR problems, although not all technologies and processing methods are used in all cases. The
three boxes show some of the basic supporting research activities.

Pattern  Recognition 

Pattern recognition is the most mature approach used for ATR applications.
Target-signature representation options range from two-dimensional image
templates to lower-dimensional vectors of features that are designed to be
differentially sensitive to targets and non-targets. Recognition depends on
feature vectors (or templates thought of as vectors) for targets clustering
together and being distinct from non-target clusters.

Potential targets (detections) are confirmed by comparing target images or
feature vectors with a database of target and non-target exemplars.
Recognition consists of selecting the best match between the target data
and the exemplar database. The matching criteria may be ad hoc (e.g.,
mean-square differences between data and exemplar vectors), or they may be
based on statistical assumptions that give the appearance of a more
rational basis.
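
A minimal sketch of best-match selection with the mean-square-difference criterion follows. The three-element feature vectors and the exemplar labels are invented for illustration, and a practical system would typically hold many exemplars per class.

```python
def mean_square_difference(a, b):
    """Mean of the squared element-wise differences of two vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) / len(a)


def classify(feature_vector, exemplars):
    """Return (label, score) of the closest exemplar.

    exemplars is a list of (label, vector) pairs covering both targets
    and non-targets, so that clutter can win the match and be rejected.
    """
    label, vector = min(
        exemplars,
        key=lambda e: mean_square_difference(feature_vector, e[1]))
    return label, mean_square_difference(feature_vector, vector)


if __name__ == "__main__":
    # Hypothetical features: length, width, mean brightness.
    exemplars = [
        ("tank", [4.0, 2.0, 0.9]),
        ("truck", [7.0, 2.5, 0.6]),
        ("clutter", [3.0, 3.0, 0.2]),
    ]
    print(classify([4.2, 2.1, 0.8], exemplars))
    print(classify([3.1, 2.9, 0.25], exemplars))
```

Including non-target exemplars in the database is what lets the second query be matched to clutter instead of being forced into a target class.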

Pattern recognition is not disjoint from the detection-theory approaches
discussed above or from the artificial neural network approaches discussed
below. A close functional similarity exists between template (or feature
vector) matching and matched filtering; differences between the two are at
a more detailed level. Detection theory emphasizes optimal (or
near-optimal) algorithms that are derived by using statistical models of
raw data. Pattern recognition, which also relies heavily on statistics,
includes more ad hoc approaches (e.g., spectral coefficients, fractal
dimensions, and blob aspect ratios), especially in the definition and
extraction of the features used to characterize targets.

Artificial  Neural  Networks 

The neural network approach places even more emphasis upon experiential
learning-by-example than does the pattern recognition approach. Neural
networks represent a processing paradigm motivated by the human visual
system, which remains the most flexible and robust target-recognition
system for imagery that we know. The goal of neural network approaches is
to develop an architecture that reproduces the flexibility and robustness
of the human visual system in a piece of equipment.

In the nearer term, neural networks can have implementation advantages
because they are highly parallel, and their ability to learn by example can
make them capable of discovering and using signature differences that
distinguish different types of targets. Neural networks are a type of
nonlinear processing that could have advantages over classical
detection-theoretic techniques or pattern-recognition techniques. One
challenge for neural network approaches is to achieve good performance over
the entire range of target signatures and background-clutter conditions,
given limited training data.

Neural network learning can be categorized as supervised or unsupervised.
Supervised networks are trained by using independently classified images or
feature vectors. Training typically consists of computerized adjustment of
parameters to optimize performance on the training set. Performance on new
data depends on how well the training data are classified and how
representative they are. Unsupervised networks define their own internal
classifications of data, independent of the external source. Data are
clustered together if they look similar to other members of the cluster.
When new data are not similar enough to any existing cluster, a new cluster
is formed.

Unsupervised networks, however, ultimately require supervised training to
perform useful recognition. A target class must be associated with each
internal cluster learned by the network. Typically, a cluster is assigned
the target name corresponding to the majority (or even plurality) of
training examples in the cluster. This training is often an iterative
process in which parameters of the unsupervised network are adjusted by the
ATR algorithm developer until each of its clusters tends to be dominated by
only one target type. Multiple clusters of data can be assigned the same
target type.
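
The two steps described above, cluster formation followed by cluster naming, can be sketched as follows. Note that this example substitutes a plain "leader" clustering rule with a fixed distance threshold for an actual unsupervised neural network; the `radius` parameter, the data, and the function names are assumptions made for illustration.

```python
from collections import Counter


def distance(a, b):
    """Euclidean distance between two feature vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) ** 0.5


def cluster(vectors, radius):
    """Assign each vector to the nearest cluster center, or start a new
    cluster when no existing center lies within radius."""
    centers, assignments = [], []
    for v in vectors:
        if centers:
            best = min(range(len(centers)),
                       key=lambda i: distance(v, centers[i]))
        if not centers or distance(v, centers[best]) > radius:
            centers.append(list(v))  # the first member acts as the center
            best = len(centers) - 1
        assignments.append(best)
    return centers, assignments


def label_clusters(assignments, labels):
    """Name each cluster after the most common training label inside it."""
    votes = {}
    for cluster_id, label in zip(assignments, labels):
        votes.setdefault(cluster_id, Counter())[label] += 1
    return {cid: counts.most_common(1)[0][0]
            for cid, counts in votes.items()}


if __name__ == "__main__":
    vectors = [(0, 0), (0.2, 0.1), (5, 5), (5.1, 4.9), (0.1, 0.3), (9, 0)]
    labels = ["tank", "tank", "truck", "truck", "truck", "tank"]
    centers, assignments = cluster(vectors, radius=1.0)
    print(assignments)
    print(label_clusters(assignments, labels))
```

In this toy data set the first cluster contains one mislabeled-looking "truck" example but is still named "tank" by majority, and two separate clusters end up with the same target type. Adjusting `radius` plays the role of the developer's iterative parameter tuning.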

Model-Based Target Recognition

The genesis of model-based target-recognition methods is in artificial
intelligence as it is applied to image understanding. Two characteristics
typify the model-based approach: (1) matching of processed sensor data to
predictions based on hypotheses concerning the target type, pose, and
range, and (2) matching of processed sensor data on the basis of multiple
localized features.

The difference between model-based target features and the feature vectors
of other approaches is important. The elements of feature vectors are not
features in the model-based sense. Corners, bright points, line segments,
and small regions corresponding to identifiable parts of a target (e.g.,
the gun, turret, or body of a tank) might be features for a model-based
system. Recognition depends on matching parts and interrelationships
between the parts. Flexible and sophisticated matching techniques can be
used, in principle, to design systems that are more robust with regard to
obscuration and target variability. In contrast, the elements of feature
vectors are usually global target measures such as the coefficients of a
transform that represents the target image or a texture measure such as
fractal dimension.

In some sense, every ATR algorithm is model based because every algorithm
makes and uses a priori assumptions about target and clutter
characteristics. The representation of the a priori information can vary
widely, however, and it is conceptually different for model-based and
non-model-based approaches. Generally, in model-based target recognition
the appearance of the target in an image is modeled.

The search-and-match approach of model-based methods also tends to
distinguish them from other algorithmic approaches. The general paradigm is
as follows: first form an initial set of hypotheses based on the sensor
data (the indexing problem), then use the hypotheses to predict features
and their relationships, and finally compare the predictions to features
extracted from the data. This approach is quite different from conducting a
computationally intensive exhaustive search for a best matching pattern.
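
The hypothesize, predict, and compare paradigm can be sketched with two-dimensional point features. Everything specific here is an assumption made for illustration: the point-set target models, the restriction of pose to four rotation angles plus a translation, the tolerance, and the simple rule of seeding each hypothesis from one observed feature.

```python
import math


def predict(model, angle_deg, offset):
    """Predict image positions of model features for a hypothesized pose."""
    a = math.radians(angle_deg)
    ca, sa = math.cos(a), math.sin(a)
    return [(ca * x - sa * y + offset[0], sa * x + ca * y + offset[1])
            for x, y in model]


def match_score(predicted, observed, tol):
    """Fraction of predicted features with an observed feature nearby."""
    hits = sum(any(math.dist(p, o) <= tol for o in observed)
               for p in predicted)
    return hits / len(predicted)


def index_hypotheses(models, observed, angles=(0, 90, 180, 270)):
    """Indexing step: a short hypothesis list seeded by observed features.

    Each hypothesis places a model's first feature on one observed feature.
    """
    for name, model in models.items():
        for angle in angles:
            anchor = predict(model[:1], angle, (0.0, 0.0))[0]
            for ox, oy in observed:
                yield name, angle, (ox - anchor[0], oy - anchor[1])


def recognize(models, observed, tol=0.25):
    """Return (score, type, angle, offset) of the best-supported hypothesis."""
    return max(
        (match_score(predict(models[n], a, off), observed, tol), n, a, off)
        for n, a, off in index_hypotheses(models, observed))


if __name__ == "__main__":
    models = {  # hypothetical feature layouts in target coordinates
        "tank": [(0, 0), (2, 0), (2, 1), (0, 1), (3, 0.5)],
        "truck": [(0, 0), (4, 0), (4, 1), (0, 1)],
    }
    # A tank rotated 90 degrees and shifted by (5, 5), with one feature
    # obscured and one unrelated clutter point added.
    observed = [(5, 5), (5, 7), (4, 7), (4.5, 8), (9, 9)]
    print(recognize(models, observed))
```

Because the score counts matched parts rather than requiring a complete template, the partly obscured tank is still recognized, which is the robustness to obscuration that localized features are meant to provide.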

Continuing  areas  of  research  in  model-based  target 
recognition  include  matching  and  evidence  accumu¬ 
lation  (i.e.,  what  is  the  most  robust  way  to  match 
hypothesized  signatures  to  the  sensor  data),  indexing 
(how  to  generate  a  small  number  of  hypotheses  that 
contain  the  correct  target),  and  modeling.  In  some 
systems  the  models  are  built  by  hand;  that  is,  analysts 
examine  sensor  data,  observe  target  signature  charac¬ 
teristics,  and  encode  them  into  some  form  of  data 


representation  that  is  convenient  for  matching.  This 
approach is time consuming, and consequently there
is  a  need  to  build  models  automatically  from  data. 
For  example,  the  radar  image-understanding  system 
discussed  in  an  earlier  article  in  the  Lincoln  Laboratory 
Journal  can  build  target  models  of  reentry  vehicles 
from radar data [1]. (Note that a system does not have
to  be  a  neural  network  to  be  capable  of  learning  from 
data.) 

Data Requirements for ATR System Development

ATR system development and evaluation requires an
enormous quantity of data because of the variability
in target signatures and background clutter. The assembly
of large, realistic, experimental databases, however,
is time consuming and expensive. As a result, we
need  to  develop  techniques  that  minimize  data  re¬ 
quirements  or,  to  put  it  differently,  utilize  experimen¬ 
tal  data  more  effectively.  Simulation  is  one  such  ap¬ 
proach;  limited  experimental  data  can  be  used  to 
develop  and  validate  simulation  models  that  can  then 
be  used  to  generate  data  for  system  development  and 
evaluation. 

If  the  statistics  of  a  target  signature  can  be  modeled 
parametrically,  then  a  small  amount  of  experimental 
data  might  be  used  to  determine  the  model  param¬ 
eters.  Another  approach  is  to  develop  physical  models 
of  image  formation  and  use  them  to  create  synthetic 
target  signatures  in  both  real-clutter  and  synthetic- 
clutter  backgrounds.  This  approach  doesn't  reduce 
the  need  for  data;  it  simply  allows  us  to  create  the  data 
in  a  computer  rather  than  collect  real  data  with  a  real 
sensor  observing  real  targets  in  real  backgrounds  in  all 
their  various  states.  Of  course,  the  synthetic  imagery 
must be realistic to be useful, and realism is determined
by comparing the synthetic data to a real database.
Fortunately, both model development and synthetic
scene generation are done off line, so that
real-time performance is not required. But the issue
of how to develop and evaluate ATR systems in a
reasonable  amount  of  time  with  a  reasonable  amount 
of  data  is  still  open. 

In  This  Issue 

We are fortunate at Lincoln Laboratory to have several
state-of-the-art sensors for collecting data that
can be used to develop target detection and recognition
systems. The in-house availability of these sensors
is reflected in the articles featured in this issue.
The articles emphasize research that utilizes data from
synthetic-aperture radars, laser radars, range-Doppler
imaging radars, and air surveillance radars. Even when
the work has not involved such sophisticated sensors,
it is nevertheless experimentally oriented and it emphasizes
the use of real sensor data. Realistic experimental
data stimulate the work and provide a rich
source of information for understanding and exploiting
how targets and the clutter backgrounds in which
they are embedded can be modeled and separated.

In this issue, we have collected nine other articles
detailing various aspects of ATR. We begin by focusing
on polarimetric synthetic-aperture radar imagery
and the techniques used to detect, discriminate, and
classify targets found in such imagery. The article by
Leslie M. Novak et al. gives an overview of a classical
pattern-recognition approach. The article by Daniel
E. Kreithen et al. discusses the problem of discriminating
targets from the natural-clutter objects (e.g.,
trees and bushes) that reflect enough radar energy to
pass the detection stage. The article by Shawn M.
Verbout et al. investigates some of the problems in
recognizing three-dimensional targets from their two-dimensional
SAR signatures, and develops a technique
for incorporating three-dimensional information
in the classification process.

The article by Allen M. Waxman et al. introduces
the topic of neural network recognition systems. It
discusses some of the motivation for neural network
architectures that are derived from biological vision
systems. It also presents the idea that information can
be extracted from the changes in target signature
from one viewpoint to the next as a target and a
sensor move past each other. These neural-processing
concepts are demonstrated on several different target-recognition
problems.

The theme of neural networks continues in articles
by Paul J. Kolodzy and by Murali M. Menon. The
article by Kolodzy describes laser radar imagery, which
simultaneously captures reflectivity information and
range information. The article also addresses the problem
of evaluating ATR algorithms. Because of the
cost of collecting accurate training and test imagery
spanning the breadth of possible ATR scenarios, the
development of an electronic terrain board (which is a
system that can generate synthetic imagery accurately
and quickly) for testing and evaluating ATR algorithms
may be more cost effective.

The article by Menon discusses the use of neural
networks to implement an image-enhancement algorithm,
based on Markov random fields, for laser
radar imagery. The effect of this preprocessing step
on the performance of the target classifier is
demonstrated.

The article by Richard L. Delanoy et al. also demonstrates
the use of laser radar measurements for target
recognition, but here the authors use a model-based
approach. Detection is done by several
algorithms that operate in parallel and are keyed to
look for various image features by using a process
called functional template correlation. Multiple functional-template
outputs are combined to form a simple
interest image to focus subsequent processing. Target
information is extracted and decomposed into features
that are matched against stored appearance models.
The weights used to govern the matching process
can be adaptively learned on training data. The appearance
models themselves can also be built semi-automatically
from training data.

The article by Delanoy and Seth W. Troxel demonstrates
the flexibility and power of functional template
correlation. This article discusses how functional
templates are used to detect and recognize gust fronts
in weather radar data. The algorithm was recently
implemented in real time and demonstrated on the
ASR-9 surveillance radar system at Orlando International
Airport in Florida.

The issue concludes with an article by Su May Hsu
on research in ballistic missile defense. This article
discusses the use of image processing techniques to
extract target features for classification of reentering
objects.

The  Future  of  ATR 

ATR continues to be an important defense technology.
While great progress has been made in sensor
technology and processing-hardware technology, the
development of recognition algorithms is far from
complete. Efforts are continuing to accelerate the development
of these algorithms. These efforts are sponsored
by ARPA, the Army Night Vision Electronic
Sensors Directorate (formerly the Night Vision Lab),
the Air Force Wright Laboratory, the Naval Air Warfare
Center (China Lake), and other government laboratories.
Common databases and evaluation techniques
are being developed so that ATR approaches can be
objectively compared.

Because of the amount of data required to develop
and test ATR systems for robustness, and the cost of
mounting realistic data collection efforts, the development
of accurate synthetic data-generation tools is
also being addressed. Organizations such as the ATR
Working Group (a joint industry-government working
group) and the Department of Defense Working
Group on ATR bring together government, industry,
and academic laboratories to exchange views on the
problems and potential solutions in ATR technology.

Just  as  the  speech-recognition  problem  required 
decades  of  serious  research  before  simple  applications 
could be brought to market, the ATR problem needs
a wide range of research before we can expect practical
systems. Most people believe the ATR problem
will  not  be  solved  by  a  single  brilliant  idea.  The 
solution  will  probably  require  a  combination  of  im¬ 
proved  sensors,  faster  computers,  and  better  algo¬ 
rithms.  From  our  point  of  view,  the  area  is  extremely 
interesting  because  it  involves  so  many  technical  dis¬ 
ciplines  (material  science,  sensor  technology,  signal 
and  image  processing,  detection  theory,  algorithm 
development,  computer  technology,  data  representa¬ 
tion,  pattern  classification,  and  neural  networks)  and 
because  it  has  so  many  potential  applications. 

REFERENCES 

1. A.M. Aull, R.A. Gabel, and T.J. Goblick, “Real-Time Radar
Image Understanding: A Machine Intelligence Approach,”
Linc. Lab. J. 5, 195 (1992).


• DUDGEON AND LACOSS

An Overview of Automatic Target Recognition


DAN E. DUDGEON
is a senior staff member in the
Machine Intelligence Technology
group. His research specialties
are in image processing,
computer vision, and computer
architectures for algorithm
development and real-time
implementations. Before joining
Lincoln Laboratory in
1979, he worked at Bolt,
Beranek, and Newman, Inc. in
Cambridge, Massachusetts.
Dan received S.B., S.M., and
E.E. degrees in electrical engineering,
and a Sc.D. degree in
signal processing, all from MIT.
In 1976 he was corecipient of
IEEE's Browder J. Thompson
Memorial Prize for best paper
by an author under the age of
30. He is the coauthor of
Multidimensional Digital Signal
Processing and Array Signal
Processing. Dan was elected an
IEEE Fellow in 1987, and he
was named a Distinguished
Lecturer of the ASSP Society in
1988. He is currently taking ice
skating lessons in a futile attempt
to keep pace with his
daughter's athletic endeavors.


RICHARD T. LACOSS
is the leader of the Machine
Intelligence Technology group.
He received a B.A. degree and a
B.S. degree in electrical engineering
from Columbia University,
and a Ph.D. degree in
electrical engineering from the
University of California at
Berkeley. From 1963 to 1978
he worked in seismic discrimination
and signal processing
research at Lincoln Laboratory.
His research now includes
image understanding, expert
systems, seismo-acoustic surveillance,
neural networks, and
parallel processing. He has
taught in both the Earth and
Planetary Sciences Department
and the Electrical Engineering
Department at MIT, and he is
the author of many articles and
reports in seismology and signal
processing. His two-year-old
son and his three-year-old
daughter also keep him very
busy these days.


Performance  of  a 
High-Resolution  Polarimetric 
SAR  Automatic  Target 
Recognition  System 

Leslie  M.  Novak,  Gregory  J.  Owirka,  and  Christine  M.  Netishen 

■  Lincoln  Laboratory  is  investigating  the  detection,  discrimination,  and 
classification  of  ground  targets  in  high-resolution,  fully  polarimetric,  synthetic- 
aperture  radar  (SAR)  imagery.  This  paper  summarizes  our  work  in  SAR 
automatic  target  recognition  by  discussing  the  prescreening,  discrimination,  and 
classification algorithms we have developed; data from 5 km² of clutter and 339
targets  were  used  to  study  the  performance  of  these  algorithms.  The  prescreener 
required  a  low  threshold  to  detect  most  of  the  targets  in  the  data,  which  resulted 
in  a  high  density  of  false  alarms.  The  discriminator  and  classifier  stages  then 
reduced  this  false-alarm  density  by  a  factor  of  100.  We  improved  target- 
detection  performance  by  using  fully  polarimetric  imagery  processed  by  the 
polarimetric  whitening  filter  (PWF),  rather  than  by  using  single-channel 
imagery.  In  addition,  the  PWF-processed  imagery  improved  the  probability  of 
correct  classification  in  a  four-class  (tank,  armored  personnel  carrier,  howitzer, 
or  clutter)  classifier. 


The ARPA-sponsored Warbreaker program is
a  broad-based  advanced  technology  program 
to  develop  new  weapons  technology  that  can 
locate  and  destroy  critical  mobile  targets  such  as  SCUD 
launch  systems  and  other  highly  mobile  platforms. 
Automatic  target  recognition  (ATR)  is  an  important 
candidate  technology  for  this  effort.  To  address  the 
surveillance  and  targeting  aspects  of  the  Warbreaker 
program,  Lincoln  Laboratory  has  developed  a  com¬ 
plete,  end-to-end,  2-D  synthetic-aperture  radar  (SAR) 
ATR system. This system requires a sensor that can
search  large  areas  and  also  provide  fine  enough  resolu¬ 
tion  to  detect  and  identify  mobile  targets  in  a  variety 
of  landscapes  and  deployments. 

The Lincoln Laboratory ATR system has three
basic stages: detection (or prescreening), discrimination,
and classification (see Figure 1). In the prescreening
stage, a two-parameter constant-false-alarm-
rate  (CFAR)  detector  selects  candidate  targets  in  a 
SAR  image  by  examining  the  amplitude  of  the  radar 
signal  in  each  pixel  of  the  image.  In  the  discrimina¬ 
tion  stage,  a  target-sized  2-D  matched  filter  accu¬ 
rately  locates  the  candidate  targets  and  determines 
their  orientation.  Then  texture-discrimination  features 
(standard  deviation,  fractal  dimension,  and  weighted- 
rank  fill  ratio)  are  used  to  reject  natural-clutter  false 
alarms [1]. In the classification stage, a 2-D pattern-matching
algorithm rejects cultural-clutter false alarms
(i.e.,  man-made  objects  that  are  not  targets)  and  clas¬ 
sifies  the  remaining  detections  by  target  type  (tank, 
armored  personnel  carrier,  or  howitzer). 

To evaluate the performance of the ATR system,
we used high-resolution (1 ft by 1 ft), fully polarimetric
target data and clutter data gathered by the Lincoln
Laboratory millimeter-wave SAR [2] at a depression
angle of 22.3° and a slant range of 7 km. We
demonstrated the robustness of the ATR system by
testing it against targets both with and without radar
camouflage.

[Figure 1 block diagram: input image, then prescreener (rejects imagery without potential targets), then discriminator (rejects natural-clutter false alarms), then classifier (rejects man-made clutter and classifies targets).]

FIGURE 1. Block diagram of the SAR automatic target recognition system. The prescreener locates candidate targets
based on the brightness of pixels in the input image, the discriminator rejects natural-clutter false alarms, and the classifier
rejects non-target cultural clutter and classifies targets by vehicle type.

Figure  2  is  an  example  of  the  quality  of  the  imagery 
gathered by the Lincoln Laboratory SAR. In this
image  of  a  golf  course  in  Stockbridge,  New  York,  the 
high  resolution  of  the  SAR  resolves  individual  trees 
and  bushes  as  well  as  small  objects  such  as  the  flagpole 
located  in  the  center  of  the  putting  green.  This  par¬ 
ticular  SAR  image  was  obtained  under  clear  weather 
conditions;  the  quality  and  resolution  of  the  image 
would  not  have  been  degraded,  however,  by  dense  fog 
or thick cloud cover. Thus a SAR sensor has a significant
advantage over optical sensors. SAR image quality
is not dependent on weather conditions, and the
sensor  can  be  used  at  any  time  of  day  or  night.  In 
addition,  SAR  sensors  can  perform  other  tasks,  such 
as searching large areas from a long distance.

The image in Figure 2 was constructed from fully
polarimetric SAR data that were processed with a
technique known as the polarimetric whitening filter
(PWF) [3]. PWF processing optimally combines the
HH (horizontal transmit, horizontal receive), HV
(horizontal transmit, vertical receive), and VV (vertical
transmit, vertical receive) polarization components
of the radar return. This polarimetric combination
enhances the quality of the imagery in two ways:
(1) the amount of speckle in the imagery is minimized,
and (2) the edges of objects in the image (such
as the pond) are sharper. As a result, PWF-processed
imagery is visually clearer than single-polarimetric-channel
imagery. In addition, PWF-processed imagery
improves the performance of all three stages of the
ATR system (compared with the performance achieved
by using single-polarimetric-channel imagery) because
PWF processing reduces clutter variance and enhances
target signatures relative to the clutter background.

This article begins with an overview of the three
stages of the baseline ATR system. Next we describe
the performance of the ATR system with both camouflaged
and uncamouflaged targets, and then we
compare performance using PWF data with performance
using single-channel (HH) data. We also
present  details  of  the  three  discrimination  features 
used  in  our  studies,  with  particular  emphasis  on  the 
fractal-dimension  feature.  Finally,  we  discuss  future 
improvements  to  the  discrimination  and  classifica¬ 
tion  stages. 

Overview  of  the  Baseline  ATR  System 

This  section  describes  our  three-stage  baseline  SAR 
ATR system, which is illustrated in Figure 1 by a simplified
block diagram. The three stages (prescreener,
discriminator, and classifier) are described below.

Stage  1:  Prescreener 

In the first stage of processing, a two-parameter CFAR
detector [4] is used as a prescreener; this stage of
processing identifies potential targets in the image on
the basis of radar amplitude (i.e., by searching for
bright returns). Computation time for this stage of
processing is significantly reduced by operating the
detector at a reduced resolution (1 m by 1 m) rather
than at the full resolution (1 ft by 1 ft).

Figure  3  is  a  sketch  of  the  two-parameter  CFAR 
detector  used  by  the  prescreener;  the  detector  is  de¬ 
fined  by  the  rule 

$$
\frac{X_t - \hat{\mu}_c}{\hat{\sigma}_c} > K_{\mathrm{CFAR}} \;\Rightarrow\; \text{target},
\qquad
\frac{X_t - \hat{\mu}_c}{\hat{\sigma}_c} \le K_{\mathrm{CFAR}} \;\Rightarrow\; \text{clutter},
\tag{1}
$$

where $X_t$ is the amplitude of the test cell, $\hat{\mu}_c$ is the
estimated mean of the clutter amplitude, $\hat{\sigma}_c$ is the
estimated standard deviation of the clutter amplitude,
and $K_{\mathrm{CFAR}}$ is a constant threshold value that defines
the false-alarm rate. As shown in the figure, the test
cell is at the center of a defined local region, and the
80 cells in the boundary stencil are used to estimate
the mean and standard deviation of the local clutter.
The guard area ensures that no target cells are included
in the estimation of the clutter statistics. If the
detection statistic calculated in Equation 1 exceeds
$K_{\mathrm{CFAR}}$, the test cell is declared to be a target pixel; if
not, it is declared to be a clutter pixel.
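The rule in Equation 1 can be sketched in a few lines of Python. This is a minimal illustration, not the authors' implementation: the function names are hypothetical, the boundary stencil is assumed to be a one-cell-thick square ring 21 cells on a side (one geometry that yields the 80 boundary cells mentioned in the text), the sample standard deviation is assumed as the estimator, and image edges are not handled.

```python
from statistics import mean, stdev


def boundary_cells(image, row, col, half_width=10):
    """Collect the one-cell-thick square ring centered on (row, col).

    With half_width=10 the ring is 21 cells on a side and holds 80 cells.
    The ring geometry is an assumption; the text only gives the cell count.
    """
    cells = []
    for r in range(row - half_width, row + half_width + 1):
        for c in range(col - half_width, col + half_width + 1):
            if max(abs(r - row), abs(c - col)) == half_width:
                cells.append(image[r][c])
    return cells


def cfar_statistic(image, row, col, half_width=10):
    """Return (X_t - mean_c) / std_c for the test cell at (row, col)."""
    ring = boundary_cells(image, row, col, half_width)
    mu_c = mean(ring)
    sigma_c = stdev(ring)  # sample standard deviation (assumed estimator)
    x_t = image[row][col]
    if sigma_c == 0.0:
        # Perfectly flat clutter: any excess over the mean stands out.
        return float("inf") if x_t > mu_c else 0.0
    return (x_t - mu_c) / sigma_c


def is_target(image, row, col, k_cfar, half_width=10):
    """Declare a target pixel when the statistic exceeds the threshold."""
    return cfar_statistic(image, row, col, half_width) > k_cfar
```

Everything between the test cell and the ring acts as the guard area in this sketch, so bright target pixels near the test cell do not contaminate the clutter estimates.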

When the amplitude distribution of the clutter is
Gaussian, the CFAR detector provides a constant
false-alarm rate for any given $K_{\mathrm{CFAR}}$ [5]. Because the
clutter distributions of high-resolution data are typically
not Gaussian [6], however, the detector does not
always yield a constant false-alarm rate. In spite of
this fact, the detector given by Equation 1 still proves
to be an effective algorithm for detecting targets in
clutter.

FIGURE 2. High-resolution (1 ft by 1 ft) synthetic-aperture radar (SAR) image of a golf course near Stockbridge,
New York. Polarimetric whitening filter (PWF) processing was used to produce this minimum-speckle image.
The radar is located at the top of the image; therefore, the radar shadows go toward the bottom of the page.
Notice that the SAR can resolve details as small as the flagpole in the putting green near the center of the image.

VOLUME 6, NUMBER 1, 1993 THE LINCOLN LABORATORY JOURNAL

[Figure 3 sketch: a target and the test cell at the center of a guard area, surrounded by a boundary stencil of cells.]

FIGURE 3. The prescreener CFAR detector. The amplitude
of the test cell is compared with the mean and standard
deviation of the clutter. The boundary consists of 80
cells that are used for clutter statistics estimation. Each
cell in the boundary consists of 16 raw pixels that are
noncoherently averaged. The guard area ensures that no
target cells are included in the clutter statistics estimation.
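As a rough illustration of the noncoherent averaging mentioned in the Figure 3 caption, the sketch below collapses each 4-by-4 block of complex raw pixels into one reduced-resolution cell. The function name is hypothetical, and averaging the power (squared magnitude) is an assumption; the text states only that 16 raw pixels are noncoherently averaged per cell.

```python
def noncoherent_average(image, block=4):
    """Average pixel power over non-overlapping block-by-block tiles.

    image is a list of rows of complex (or real) pixel values. Phase is
    discarded before averaging, which is what makes the average
    noncoherent. Rows or columns that do not fill a whole tile are dropped.
    """
    n_rows = len(image) // block
    n_cols = len(image[0]) // block
    reduced = []
    for i in range(n_rows):
        out_row = []
        for j in range(n_cols):
            total_power = 0.0
            for di in range(block):
                for dj in range(block):
                    pixel = image[i * block + di][j * block + dj]
                    total_power += abs(pixel) ** 2
            out_row.append(total_power / (block * block))
        reduced.append(out_row)
    return reduced
```

With 1-ft raw pixels, a 4-by-4 tile covers roughly the 1-m cell size used by the prescreener.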

Only those test cells whose amplitudes stand out
from the surrounding cells are declared to be targets.
The higher we set the threshold value of $K_{\mathrm{CFAR}}$, the
more a test cell must stand out from its background
for the cell to be declared a target. Because a single
target can produce multiple CFAR detections, the
detected pixels are clustered (grouped together) by
the detector if they are within a target-sized neighborhood.
Then a 120-ft-by-120-ft region of interest of
full resolution (1 ft by 1 ft) data around each cluster
centroid is extracted and passed to the discrimination
stage of the algorithm for further processing.
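The clustering step is not specified in detail in the text, so the following is only one plausible sketch: a greedy pass that adds each detected pixel to the first cluster whose running centroid lies within a target-sized neighborhood, and otherwise starts a new cluster. The function names and the `radius` parameter are hypothetical.

```python
def centroid(points):
    """Return the mean (row, col) of a list of pixel coordinates."""
    n = len(points)
    return (sum(p[0] for p in points) / n, sum(p[1] for p in points) / n)


def cluster_detections(pixels, radius):
    """Greedily group detected pixels and return one centroid per cluster.

    A pixel joins a cluster when it lies within `radius` cells of that
    cluster's current centroid in both row and column (a square,
    target-sized neighborhood); otherwise it seeds a new cluster.
    """
    clusters = []
    for r, c in pixels:
        for cluster in clusters:
            cr, cc = centroid(cluster)
            if abs(r - cr) <= radius and abs(c - cc) <= radius:
                cluster.append((r, c))
                break
        else:
            clusters.append([(r, c)])
    return [centroid(cluster) for cluster in clusters]
```

Each returned centroid would then mark the center of the region of interest that is cut from the full-resolution data.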

Stage  2:  Discriminator 

The discrimination stage takes as its input the 120-ft-by-120-ft
regions of interest passed to it by the
prescreener, and it analyzes each region of interest at
full resolution (1 ft by 1 ft). The goal of discrimination
processing is to reject the regions containing
natural-clutter  false  alarms  while  accepting  the  re¬ 
gions  containing  real  targets.  This  stage  consists  of 
three  steps:  (1)  determining  the  position  and  orienta¬ 
tion  of  a  detected  object,  (2)  computing  simple  tex¬ 
tural  features,  and  (3)  combining  the  features  into  a 
discrimination  statistic  that  measures  how  targetlike 
the  detected  object  is. 

In  the  first  step  of  the  discrimination  stage  the 
algorithm  determines  the  position  and  orientation  of 
the  target  by  placing  a  target-sized  rectangular  tem¬ 
plate  on  the  image.  The  algorithm  then  slides  and 
rotates  the  template  until  the  energy  within  the  tem¬ 
plate  is  maximized.  The  position  estimate  produced 
in  the  discrimination  stage  is  more  accurate  than  the 
position  estimate  produced  in  the  prescreening  stage. 
This  operation  is  computationally  feasible  because  it 
is  performed  only  on  the  selected  high-resolution  re¬ 
gions  of  interest  passed  by  the  prescreener,  and  not  on 
the  entire  image.  Mathematically,  this  operation  is 
equivalent  to  processing  the  data  in  the  region  of 
interest  with  a  2-D  matched  filter  for  the  case  when 
the  orientation  of  the  target  is  unknown. 
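The slide-and-rotate search described above can be sketched as a brute-force loop over template centers and orientations. This is an illustrative sketch under stated assumptions, not the system's implementation: the function names, the 15° angle step, and the use of squared pixel amplitude as "energy" are all assumptions, and a practical version would restrict the search to a small area around the prescreener centroid.

```python
import math


def template_energy(image, cy, cx, theta, length, width):
    """Sum squared amplitude inside a rotated rectangle centered at (cy, cx)."""
    cos_t, sin_t = math.cos(theta), math.sin(theta)
    reach = int(math.ceil(math.hypot(length, width) / 2))
    energy = 0.0
    for r in range(max(0, cy - reach), min(len(image), cy + reach + 1)):
        for c in range(max(0, cx - reach), min(len(image[0]), cx + reach + 1)):
            dy, dx = r - cy, c - cx
            # Rotate the pixel offset into the template's own axes.
            u = dx * cos_t + dy * sin_t
            v = -dx * sin_t + dy * cos_t
            if abs(u) <= length / 2 and abs(v) <= width / 2:
                energy += image[r][c] ** 2
    return energy


def best_pose(image, length, width, angle_step_deg=15):
    """Return (max energy, (row, col, angle in degrees)) over all poses."""
    best_energy, best = -1.0, None
    for cy in range(len(image)):
        for cx in range(len(image[0])):
            # A rectangle is symmetric under 180 degree rotation.
            for deg in range(0, 180, angle_step_deg):
                e = template_energy(image, cy, cx, math.radians(deg), length, width)
                if e > best_energy:
                    best_energy, best = e, (cy, cx, deg)
    return best_energy, best
```

Because several neighboring poses can capture the same pixels, the returned orientation is only accurate to about one angle step.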

In  the  second  step  of  the  discrimination  stage  the 
algorithm  calculates  three  textural  features:  (1)  the 
standard  deviation  of  the  data  within  the  target-sized 
template,  (2)  the  fractal  dimension  of  the  pixels  in  the 
region  of  interest,  and  (3)  the  weighted-rank  fill  ratio 
of  the  data  within  the  template.  The  standard  devia¬ 
tion  of  the  data  within  the  template  is  a  statistical 
measure  of  the  fluctuation  of  the  pixel  intensities: 
targets  typically  exhibit  significantly  larger  standard 
deviations  than  natural  clutter.  The  fractal  dimension 
of  the  pixels  in  the  region  of  interest  provides  infor¬ 
mation  about  the  spatial  distribution  of  the  brightest 
scatterers  of  the  detected  object.  It  complements  the 
standard-deviation  feature,  which  depends  only  on 
the  intensities  of  the  scatterers  and  not  on  their  spatial 
locations.  The  weighted-rank  fill  ratio  of  the  data 
within  the  template  measures  the  fraction  of  the  total 
power  contained  in  the  brightest  5%  of  the  detected 
object’s  scatterers.  For  targets,  a  significant  portion  of 
the  total  power  comes  from  a  small  number  of  very 
bright  scatterers;  for  natural  clutter,  the  total  power  is 
distributed  more  evenly  among  the  scatterers. 
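Two of the three textural features can be sketched directly from the descriptions above. The function names are hypothetical, and the inputs are assumed to be flat lists of the pixel values inside the template; the fractal-dimension feature is omitted here because this passage does not define how it is computed.

```python
from statistics import pstdev


def standard_deviation_feature(pixels):
    """Spread of the pixel intensities inside the target-sized template."""
    return pstdev(pixels)


def weighted_rank_fill_ratio(powers, fraction=0.05):
    """Fraction of total power held by the brightest `fraction` of pixels."""
    ranked = sorted(powers, reverse=True)
    n_bright = max(1, round(fraction * len(ranked)))
    return sum(ranked[:n_bright]) / sum(ranked)
```

For example, a template with five scatterers of power 100 among 95 pixels of power 1 gives a fill ratio of 500/595 (about 0.84), while a perfectly uniform template of 100 pixels gives 0.05.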




In  the  third  step  of  the  discrimination  stage  the 
algorithm  combines  the  three  textural  features  into  a 
single discrimination statistic; this discrimination statistic
is calculated as a quadratic distance measurement
(see the accompanying article entitled “Discriminating
Targets from Clutter” by Daniel E.
Kreithen  et  al.).  Most  natural-clutter  false  alarms  have 
a  large  quadratic  distance  and  are  rejected  at  this 
stage.  Most  man-made  clutter  discretes  (such  as  build¬ 
ings  and  bridges)  pass  the  discrimination  stage;  there¬ 
fore,  the  next  stage — the  classifier  stage — must  have 
the  ability  to  reject  them. 
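The exact form of the quadratic distance is given in the accompanying article rather than here, so the sketch below assumes a common choice: the feature vector's squared Mahalanobis distance from the mean of the target training features, divided by the number of features. The function names and the normalization are assumptions.

```python
def solve(matrix, rhs):
    """Solve matrix * x = rhs by Gaussian elimination with partial pivoting."""
    n = len(rhs)
    m = [list(row) + [rhs[i]] for i, row in enumerate(matrix)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for k in range(col, n + 1):
                m[r][k] -= factor * m[col][k]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(m[i][k] * x[k] for k in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x


def quadratic_distance(features, target_mean, target_cov):
    """Return (x - m)^T C^-1 (x - m) / n for an n-element feature vector.

    target_mean and target_cov are assumed to be estimated from target
    training data; a large distance means the object is not targetlike.
    """
    diff = [f - mu for f, mu in zip(features, target_mean)]
    weighted = solve(target_cov, diff)
    return sum(d * w for d, w in zip(diff, weighted)) / len(diff)
```

A detection would be rejected as natural clutter when this distance exceeds a threshold chosen from the training data.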

Stage  3:  Classifier 

A  2-D  pattern-matching  classifier  rejects  cultural  false 
alarms  caused  by  man-made  clutter  discretes  and  then 
classifies  target  detections  by  vehicle  type.  In  our 
studies  we  implemented  a  four-class  classifier  (tank, 
armored  personnel  carrier,  howitzer,  and  clutter)  us¬ 
ing  high-resolution  (1  ft  by  1  ft)  PWF  imagery.  De¬ 
tected  objects  that  pass  the  discrimination  stage  are 
matched  against  stored  references  of  the  tank,  ar¬ 
mored  personnel  carrier,  and  howitzer.  If  none  of  the 
matches  exceeds  a  minimum  required  score,  the  de¬ 
tected  object  is  classified  as  clutter;  otherwise,  the 
detected  object  is  assigned  to  the  class  (tank,  armored 
personnel  carrier,  or  howitzer)  with  the  highest  match 
score. 
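The decision rule of the classifier can be sketched as follows. The match score is assumed here to be a normalized correlation between the detected image chip and each stored reference, both flattened to equal-length lists; the function names and the default minimum score of 0.7 are illustrative assumptions.

```python
import math


def correlation_score(chip, reference):
    """Normalized correlation between two equal-length pixel lists."""
    mean_c = sum(chip) / len(chip)
    mean_r = sum(reference) / len(reference)
    num = sum((a - mean_c) * (b - mean_r) for a, b in zip(chip, reference))
    den = math.sqrt(
        sum((a - mean_c) ** 2 for a in chip)
        * sum((b - mean_r) ** 2 for b in reference)
    )
    return num / den if den else 0.0


def classify(chip, references, min_score=0.7):
    """Return the best-matching class label, or "clutter" if no match is good enough.

    references maps a class label to a list of stored reference images
    (for example, one per aspect angle).
    """
    best_label, best_score = "clutter", float("-inf")
    for label, templates in references.items():
        for template in templates:
            score = correlation_score(chip, template)
            if score > best_score:
                best_label, best_score = label, score
    return best_label if best_score >= min_score else "clutter"
```

In use, `references` would hold the smoothed reference images for the tank, armored personnel carrier, and howitzer, and any object whose best score falls below the minimum is rejected as clutter.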

The  pattern-matching  references  used  in  the  classi¬ 
fier  were  constructed  by  averaging  five  consecutive 


spotlight-mode  images  of  a  target  collected  at  1°  in¬ 
crements  of  azimuth,  yielding  72  smoothed  images  of 
each  of  the  targets.  Figure  4  shows  typical  pattern- 
matching  references  for  the  three  targets  at  a  particu¬ 
lar  aspect  angle. 

Performance  of  the  Baseline  ATR  System 

This  section  describes  the  performance  of  the  pre¬ 
screening,  discrimination,  and  classification  stages  of 
the baseline SAR ATR system. Clutter data from 5
km² of ground area were processed through the ATR-
system  algorithms,  along  with  data  for  162  camou¬ 
flaged  targets  and  177  uncamouflaged  targets.  The 
camouflaged  target  data  used  in  this  study  represent  a 
difficult  scenario  in  which  the  test  targets  were  realis¬ 
tically  deployed  and  covered  with  radar  camouflage. 
The training data used to design the ATR system
were  taken  from  the  uncamouflaged  targets.  The  clut¬ 
ter  data  contained  a  moderate  number  of  man-made 
clutter  discretes. 

The CFAR detection threshold in the prescreener
was set relatively low to obtain a high initial probability
of detection (P_D) for the target data. At the output
of the prescreener, P_D = 1.00 was obtained for the
uncamouflaged targets, while P_D = 0.82 was obtained
for the camouflaged targets. At this CFAR threshold,
a false-alarm density of approximately 30 false alarms
per km² (FA/km²) was obtained. The initial detection
processing was carried out at reduced resolution (1 m
by 1 m).


FIGURE 4. Typical pattern-matching reference templates for (a) a tank, (b) an armored personnel carrier, and (c)
a howitzer. These pattern-matching references are used to classify detected objects by vehicle type. They are
constructed by averaging five consecutive spotlight-mode images of a target collected at 1° increments of
azimuth.




Table 1. Overview of ATR System Performance.

                        FA/km²    P_D, Uncamouflaged Targets*    P_D, Camouflaged Targets
After prescreening      30        1.00                           0.82
After discrimination    3.0       1.00                           0.75
After classification    0.3       1.00                           0.70

* The uncamouflaged target test data was used for algorithm training.


Each detected region of interest (containing a potential
target) was passed to the discrimination stage
for further processing at full resolution (1 ft by 1 ft).
The discrimination stage determined the location and
orientation of each detected object, and then calculated
the textural features (standard deviation, fractal
dimension, and weighted-rank fill ratio) that were
used to reject natural-clutter discretes. Discrimination
processing reduced the false-alarm density by a
factor of 10, to 3 FA/km². No uncamouflaged targets
were rejected by the textural-feature tests; thus the
initial P_D of 1.00 was maintained. A few of the camouflaged
targets were rejected by the textural-feature
tests, which resulted in P_D = 0.75 for these targets.

In the classification stage of processing, the 2-D
pattern-matcher was applied to those detections which
had passed the discrimination stage. Classification
processing reduced the false-alarm rate by another
factor of 10, to approximately 0.3 FA/km². No
uncamouflaged targets were rejected by the pattern
matcher (resulting in PD = 1.00 for these targets), but
some camouflaged targets were incorrectly classified
as clutter (resulting in PD = 0.70 for these targets).

Table 1 summarizes the performance of all three
stages of the ATR system. The uncamouflaged target
data were used for training the discrimination and
classification stages. The thresholds of the algorithms
were set so that perfect performance was achieved
with the uncamouflaged data. Once the thresholds
had been set in this way, the clutter and camouflaged
targets were processed.

Figure 5 illustrates how clutter rejection was implemented
by the pattern-matching classifier. As the figure
shows, most of the clutter discretes had correlation
scores below the threshold value of 0.7 and thus
were rejected (i.e., classified as clutter). Detected objects
with correlation scores equal to or greater than
0.7 were considered to be sufficiently targetlike, and
were classified according to their highest correlation
scores (as tank, armored personnel carrier, or howitzer).
Figure 5 also indicates that only a small fraction
of the camouflaged targets were declared to be clutter
because of low correlation scores.
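
The accept-or-reject rule described above can be sketched in a few lines. This is an illustrative sketch only, not the authors' implementation: the function name, the dictionary layout, and the example scores are hypothetical, and only the 0.7 threshold and the "highest correlation wins" rule come from the text.

```python
# Hypothetical sketch of the threshold-then-best-match rule described in the text.
CORRELATION_THRESHOLD = 0.7  # threshold value quoted in the text


def classify_detection(scores, threshold=CORRELATION_THRESHOLD):
    """Return the best-matching class name, or "clutter" if no score reaches the threshold.

    scores maps a class name (for example "tank") to the highest correlation
    score obtained against that class's reference templates.
    """
    best_class = max(scores, key=scores.get)
    if scores[best_class] >= threshold:
        return best_class
    return "clutter"


# Made-up correlation scores for two detected objects.
print(classify_detection({"tank": 0.81, "apc": 0.74, "howitzer": 0.52}))
print(classify_detection({"tank": 0.41, "apc": 0.66, "howitzer": 0.38}))
```

The first object is assigned to the class with the highest score; the second falls below the threshold for every class and is declared clutter.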

The second function of the classifier is to assign


[Figure 5 plot; horizontal axis: correlation, from -0.2 to 1.0.]

FIGURE 5. Overview of classifier performance. The clutter
and target data were processed separately. Most of the
clutter had correlation scores below the threshold value of
0.7, and was rejected. Detected objects above the threshold
were classified according to their highest correlation
scores.




•  NOVAK  ET  AL. 

Performance  of  a  High-Resolution  Polarimetric  SAR  Automatic  Target  Recognition  System 


Table 2. Correlation Pattern-Matching Classifier Performance
(training data: uncamouflaged targets;
test data: camouflaged targets and clutter discretes)

                      Percent Classified as
Actual class    Tank     APC      Howitzer    Clutter
Tank            89%      11%      0%          0%
APC             0%       96%      0%          4%
Howitzer        0%       13%      71%         16%
Clutter         0%       14%      0%          86%


objects accepted as targets to target classes (tank, armored
personnel carrier, howitzer). Table 2 shows the
classification performance of the baseline classifier as
a confusion matrix that tabulates the correct and
incorrect classifications. Recall that the classifier used
templates constructed from uncamouflaged targets;
the classification results shown in Table 2 are for
clutter discretes and camouflaged test targets that
passed the detection and discrimination stages. At the
output of the classification stage, 70% of the camouflaged
targets were classified as targets, and 85% of
those targets were correctly classified by vehicle type.

The four-class, 2-D pattern-matching algorithm
used in this study was implemented with normalized
dB references, which provided the best overall performance
among five different reference schemes that
were tested.

ATR  Performance  Using  Single-Channel  Data 
versus  Fully  Polarimetric  Data 

We  compared  the  performance  of  the  ATR  system 
using  single-channel  (HH)  data  with  the  performance 
of  the  system  using  fully  polarimetric  PWF  data. 
Figure  6(a)  shows  an  HH-polarization  SAR  image  of 
a  scene  processed  to  reduced  resolution  (1  m  by  1  m). 
In  this  image  two  regions  of  trees  are  separated  by  a 
narrow  strip  of  coarse  scrub.  Also  visible  in  the  image, 
although  somewhat  faint,  are  four  power-line  towers 
located in the scrub. Figure 6(b) shows the corresponding
PWF-processed image of the scene. The
power-line towers have greater intensity in the PWF
image than in the HH image because the PWF image
includes contributions from HH, HV, and VV polarizations.

Table 3 compares ATR system performance using
HH versus PWF data. The comparison was performed
by using the same target and clutter data used to


FIGURE  6.  Comparison  of  (a)  HH  and  (b)  PWF  imagery.  The  power-line  towers  are  more  clearly  visible  in  the  PWF  image 
because  PWF  processing  combines  data  from  all  three  polarization  channels  (HH,  HV,  and  VV). 



Table 3. Comparison of ATR System Performance
Using HH Imagery versus PWF Imagery

                          FA/km²    PD, HH Data    PD, PWF Data
After prescreening        30        0.65           0.82
After discrimination      3.0       0.57           0.75
After classification      0.3       0.24           0.70


generate the results in Table 1. At the output of each
stage (prescreening, discrimination, and classification)
the false-alarm densities were set equal for HH and
PWF clutter imagery. This normalization permits us
to compare the HH and PWF detection performance
at each stage.

The detection performance was better with PWF
data than with HH data. At the output of the detection
stage PD = 0.82 for PWF data and PD = 0.65 for
HH data. At the output of the discrimination stage
PD = 0.75 for PWF data and PD = 0.57 for HH data.
At the output of the classification stage PD = 0.70 for
PWF data and PD = 0.24 for HH data; the PD at the
end of the classification stage represents the overall
end-to-end performance of the ATR system.
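
The cumulative PD values in Table 3 can also be read stage by stage. The sketch below is an illustration added here, not part of the original study: it divides consecutive cumulative PD values to estimate the fraction of targets entering each stage that also leave it. The function and variable names are hypothetical; the numbers are copied from Table 3.

```python
# Cumulative probability of detection (PD) after each stage, copied from Table 3.
STAGES = ["prescreening", "discrimination", "classification"]
PD = {
    "HH": [0.65, 0.57, 0.24],
    "PWF": [0.82, 0.75, 0.70],
}


def stage_retention(cumulative_pd):
    """Fraction of targets entering each stage that also leave it.

    The first entry is the prescreener PD itself; later entries are ratios of
    consecutive cumulative PD values.
    """
    retention = [cumulative_pd[0]]
    for previous, current in zip(cumulative_pd, cumulative_pd[1:]):
        retention.append(current / previous)
    return retention


for name, values in PD.items():
    for stage, kept in zip(STAGES, stage_retention(values)):
        print(f"{name:>3} {stage:<15} keeps {kept:.2f} of incoming targets")
```

Read this way, the classification stage keeps roughly 93% of the PWF targets that reach it but only about 42% of the HH targets, which is where most of the end-to-end difference arises.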

Details  of  the  Baseline  Discrimination  Features 

This section presents details of the three baseline
discrimination features: standard deviation, fractal dimension,
and weighted-rank fill ratio. The equations
for calculating each feature are also discussed. Because
the concept of the fractal-dimension feature is fairly
involved, this feature is discussed at greater length
than the other two features.

Standard-Deviation  Feature 

The standard-deviation feature is a measure of the
fluctuation in intensity, or radar cross section, in an
image. The log standard deviation for a particular
region is defined as the standard deviation of the
radar returns (in dB) from the region. If the radar
intensity in power from range r and azimuth a is
denoted by P(r, a), then the log standard deviation σ
can be estimated as follows:

$$\sigma = \sqrt{\frac{S_2 - S_1^2 / N}{N - 1}} , \qquad (1)$$

where

$$S_1 = \sum_{r,a \,\in\, \text{region}} 10 \log_{10} P(r, a) \qquad (2)$$

and

$$S_2 = \sum_{r,a \,\in\, \text{region}} \left[ 10 \log_{10} P(r, a) \right]^2 , \qquad (3)$$

and N is the number of points in the region.
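
As a minimal sketch of Equations 1 through 3, the function below computes the log standard deviation from a flat list of pixel powers. The function name and the sample values are hypothetical, and the clamp against tiny negative rounding errors is an addition of this sketch rather than something stated in the text.

```python
import math


def log_standard_deviation(powers):
    """Estimate the standard deviation of the dB values of a region.

    powers is a flat sequence of positive pixel powers P(r, a).
    """
    n = len(powers)
    if n < 2:
        raise ValueError("need at least two pixels")
    db_values = [10.0 * math.log10(p) for p in powers]
    s1 = sum(db_values)                    # Equation 2
    s2 = sum(v * v for v in db_values)     # Equation 3
    variance = (s2 - s1 * s1 / n) / (n - 1)
    return math.sqrt(max(variance, 0.0))   # Equation 1, clamped against rounding error


region = [1.0, 10.0, 100.0, 1000.0]        # made-up powers: 0, 10, 20, 30 dB
print(round(log_standard_deviation(region), 2))
```

For this made-up region the dB values are 0, 10, 20, and 30, so the estimate is about 12.91 dB.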
Fractal-Dimension  Feature 

The  fractal-dimension  feature  provides  a  measure  of 
the  spatial  distribution  of  the  brightest  scatterers  in  a 
region.  In  the  following  paragraphs  we  present  the 
formal  definition  of  fractal  dimension,  along  with 
several  simple  examples  to  illustrate  the  definition. 
We  also  show  how  to  calculate  the  fractal  dimension 
of  detected  objects  in  a  SAR  image.  By  using  high- 
resolution  SAR  imagery  gathered  at  Stockbridge,  New 
York,  we  demonstrate  how  the  spatial  texture  differ¬ 
ences  measured  by  the  fractal-dimension  feature  can 
be  used  to  discriminate  between  natural  and  cultural 
objects. 

The fractal dimension of a set S in a two-dimensional
space can be defined as follows:

$$\dim(S) = \lim_{\epsilon \to 0} \frac{\log M_\epsilon}{\log(1/\epsilon)} , \qquad (4)$$

where M_ε = the minimum number of ε-by-ε boxes
needed to cover S. (By covering S, we mean finding a
set of square boxes B_i such that ∪ B_i ⊇ S.) For small
values of ε, the definition in Equation 4 is equivalent
to writing

$$M_\epsilon \approx K \epsilon^{-\dim(S)} , \qquad (5)$$

where K is a constant. This equation expresses one of
the important ideas behind fractal analysis: fractal




dimension measures how certain properties of a set
change with the scale of observation ε. In the following
paragraphs, three specific examples clarify
this idea.

Example 1. Let S be a single point. A point can be
covered by one box regardless of the box size ε; hence

$$\dim(\text{point}) = \lim_{\epsilon \to 0} \frac{\log M_\epsilon}{\log(1/\epsilon)} = \lim_{\epsilon \to 0} \frac{\log 1}{\log(1/\epsilon)} = 0 .$$

We could use Equation 5 to derive this same result by
noting that the number of square boxes needed to
cover S is independent of the box size; thus dim(point)
equals zero. Figure 7 summarizes this example. In
addition, as long as ε is below a certain critical value, a
finite set of isolated points can be covered by a fixed
number of boxes (independent of ε). Therefore, a
finite set of isolated points also has a dimension of
zero.

Example 2. Let S be a line segment. For simplicity,
we assume the line is 1 unit long. A single 1-unit-by-1-unit
box can cover the line. If we reduce the box
size to 1/2 unit by 1/2 unit, then two boxes are
needed to cover the line. If we reduce the box size
again, to 1/4 unit by 1/4 unit, four boxes are needed
to cover the line. Each time the box size is halved, the
number of boxes needed to cover a line doubles; thus

$$\dim(\text{line}) = \lim_{\epsilon \to 0} \frac{\log(1/\epsilon)}{\log(1/\epsilon)} = 1 .$$

Figure 8 summarizes this example. It can also be
shown that a finite set of isolated line segments has a
fractal dimension of one.

Example 3. Let S be a square area. Again, for simplicity,
we assume the square is 1 unit by 1 unit in
size. A single 1-unit-by-1-unit box can cover the square.
If we reduce the box size to 1/2 unit by 1/2 unit, four
boxes are required. If we reduce the box size again, to
1/4 unit by 1/4 unit, 16 boxes are required. As the
box size is halved, the number of boxes needed to
cover the square area quadruples; thus

$$\dim(\text{square}) = \lim_{\epsilon \to 0} \frac{\log(1/\epsilon^2)}{\log(1/\epsilon)} = 2 .$$

    ε       M(ε)
    1       1
    1/2     1
    1/4     1
    ε       1

$$\dim(\text{point}) = \lim_{\epsilon \to 0} \frac{\log 1}{\log(1/\epsilon)} = 0$$

FIGURE 7. Fractal-dimension calculation for a point. As
the size of the square box that covers the point decreases,
the number of boxes required to cover the point remains
the same (i.e., one). As a result, the fractal dimension of a
point is zero.

    ε       M(ε)
    1       1
    1/2     2
    1/4     4
    ε       1/ε

$$\dim(\text{line}) = \lim_{\epsilon \to 0} \frac{\log(1/\epsilon)}{\log(1/\epsilon)} = 1$$

FIGURE 8. Fractal-dimension calculation for a line segment.
As the size of each square box that covers the line
segment is halved, the number of boxes required to cover
the line segment doubles. As a result, the fractal dimension
of a line segment is one.




Figure 9 summarizes this example. We used a
square area in this example for convenience. Any
area that can be synthesized from a finite number of
square areas, however, will have a fractal dimension
of two.

From these simple examples, we see that fractal
dimension clearly has the potential to discriminate
between certain types of objects in 2-D space. The
question is, how can this feature be applied to SAR
data?

The first step in applying the fractal-dimension
concept to a radar image is to select an appropriately
sized region of interest, and then convert the pixel
values in the region of interest to binary (i.e., each
pixel value equals 0 or 1). One method of performing
this conversion is amplitude thresholding, in which all
pixel values exceeding a specified threshold are converted
to 1, and the remaining pixel values are converted
to 0. Another method is to select the N brightest
pixels in the region of interest and convert their
values to 1, while converting the rest of the pixel
values to 0; this second method is the approach we
used (because it worked better).
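
The N-brightest conversion can be sketched as follows. This is a hedged illustration, not the authors' code: the function name, the list-of-rows image layout, and the tie-breaking rule are assumptions of the sketch, and the pixel values are made up.

```python
def binarize_n_brightest(image, n):
    """Return a 0/1 image in which the n brightest pixels are set to 1.

    image is a list of rows of pixel values. Ties at the cutoff are broken by
    scan order, which is an arbitrary choice made for this sketch.
    """
    coords = [(r, c) for r, row in enumerate(image) for c in range(len(row))]
    # Sort pixel coordinates from brightest to dimmest (the sort is stable).
    coords.sort(key=lambda rc: image[rc[0]][rc[1]], reverse=True)
    binary = [[0] * len(row) for row in image]
    for r, c in coords[:n]:
        binary[r][c] = 1
    return binary


# Made-up 3-by-3 region of interest.
roi = [
    [0.1, 0.9, 0.2],
    [0.8, 0.3, 0.7],
    [0.2, 0.1, 0.6],
]
for row in binarize_n_brightest(roi, 3):
    print(row)
```

Unlike amplitude thresholding, this rule always yields exactly N pixels with the value 1 (when the region has at least N pixels), so the 1-pixel box count used later is fixed in advance.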

After converting the radar image to a binary image,


$$\dim(\text{square}) = \lim_{\epsilon \to 0} \frac{\log(1/\epsilon^2)}{\log(1/\epsilon)} = 2$$

FIGURE 9. Fractal-dimension calculation for a square area.
Each time the size of each square box that covers the
square area is halved, the number of boxes required to
cover the square quadruples. As a result, the fractal dimension
of a square is two.


we let the pixels with the value 1 constitute the set S
in Equation 4. A problem arises, however, when we
attempt to apply the definition of Equation 4 directly
to our binary image. According to the definition of
fractal dimension, we need to take a limit as the box
size ε goes to zero. The smallest meaningful value of
the box size ε, however, is the size of one pixel.
Therefore, we must develop an approximation to the
formula of Equation 4.

From Equation 5 we observe that

$$\log M_\epsilon \approx -\dim \cdot \log \epsilon + \log K$$

for small ε. Because the relation between log M_ε and
log ε is linear for small ε, with the slope equal to the
negative of the dimension, the fractal dimension can
be approximated by using only the box counts for
ε = 1 and ε = 2 in the following way:

$$\dim \approx -\frac{\log M_1 - \log M_2}{\log 1 - \log 2} = \frac{\log M_1 - \log M_2}{\log 2} , \qquad (6)$$

where M_1 is the number of 1-pixel-by-1-pixel boxes
needed to cover the image and M_2 is the number of
2-pixel-by-2-pixel boxes needed to cover the
image. Figure 10 summarizes the fractal dimensions
of simple objects as they are observed in SAR
imagery.
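
A minimal sketch of the Equation 6 approximation on a binary image is given below. One point is an assumption of this sketch rather than the authors' method: finding a truly minimal cover with 2-pixel-by-2-pixel boxes is not spelled out in the text, so the code takes the best of the four possible grid alignments, which can overestimate M_2 and therefore underestimate the dimension. All function names are hypothetical.

```python
import math


def count_boxes(binary, size, row_offset=0, col_offset=0):
    """Count grid boxes of the given size that contain at least one 1-pixel."""
    occupied = set()
    for r, row in enumerate(binary):
        for c, value in enumerate(row):
            if value:
                occupied.add(((r + row_offset) // size, (c + col_offset) // size))
    return len(occupied)


def fractal_dimension(binary):
    """Approximate Equation 6 from a 0/1 image given as a list of rows."""
    m1 = count_boxes(binary, 1)
    if m1 == 0:
        raise ValueError("binary image contains no 1-pixels")
    # Assumption: approximate the minimal 2-by-2 cover by the best of the
    # four grid alignments; a true minimal cover may use fewer boxes.
    m2 = min(count_boxes(binary, 2, dr, dc) for dr in (0, 1) for dc in (0, 1))
    return (math.log(m1) - math.log(m2)) / math.log(2)


point = [[0, 0, 0], [0, 1, 0], [0, 0, 0]]
line = [[1] * 8]
square = [[1] * 4 for _ in range(4)]
print(fractal_dimension(point), fractal_dimension(line), fractal_dimension(square))
```

For these three toy shapes the sketch returns values of approximately 0, 1, and 2, matching the point, line, and square cases worked out in Examples 1 through 3.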

The following paragraphs provide two examples of
calculating the fractal dimension of regions of interest
in radar imagery. The examples use data extracted
from the SAR image shown in Figure 11. The figure
shows a Stockbridge, New York, clutter scene that
includes trees, a street with houses on both sides, a
swimming pool, and a meadow. The examples demonstrate
the fractal-dimension calculation for a typical
tree (natural clutter) and the rooftop of a house
(cultural clutter).

Figure 12 illustrates the fractal-dimension calculation
for a binary image of a tree; the binary image was
formed by selecting the 50 brightest pixels from a
120-ft-by-120-ft region of interest in the image of
Figure 11. The number of 1-pixel-by-1-pixel boxes
needed to cover this image is identical to the number
of pixels with the value 1 (i.e., equals 50). The

[Figure 9 box counts: for ε = 1, 1/2, 1/4, and general ε, M(ε) = 1, 4, 16, and 1/ε², respectively.]



FIGURE 10. Fractal dimensions of simple objects in SAR imagery. (a) Points have a fractal
dimension of zero, (b) lines have a fractal dimension of one, and (c) squares and L-shaped
objects have a fractal dimension of two. In the text we show how to calculate the fractal dimensions
of these objects by using the approximation derived in Equation 6.


minimum number of 2-pixel-by-2-pixel boxes needed
to cover the image is 41; Figure 12 shows this minimal
covering. By applying Equation 6, we find that
the fractal dimension of the tree is 0.29. This relatively
low value reflects the fact that the binary image
of the tree consists primarily of isolated pixels.

Figure 13 illustrates the fractal-dimension calculation
for a binary image of a house rooftop (this image
was formed in the same way as the image of the tree in
Figure 12). Notice that the pixels in this image are
clustered into lines and areas. The number of 1-pixel-by-1-pixel
boxes needed to cover the image is 50, but
the minimum number of 2-pixel-by-2-pixel boxes
needed to cover the image is only 21. By using Equation 6,
we find that the fractal dimension of the house
rooftop is 1.25. This relatively high value is
caused by the clustering of the pixels. The different
fractal-dimension values for the tree and the
rooftop illustrate that this feature can be used to
discriminate between natural clutter and cultural
clutter.
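
As a check on the two worked examples, Equation 6 can be evaluated directly from the box counts given in the captions of Figures 12 and 13 (50 bright pixels in each binary image, covered by 41 and 21 two-pixel-by-two-pixel boxes, respectively). The subscripts "tree" and "rooftop" are labels introduced here for illustration:

```latex
\dim_{\text{tree}} \approx \frac{\log 50 - \log 41}{\log 2} \approx 0.29 ,
\qquad
\dim_{\text{rooftop}} \approx \frac{\log 50 - \log 21}{\log 2} \approx 1.25 .
```

Because the result depends only on the ratio M_1/M_2, the base of the logarithm does not matter.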

Weighted-Rank  Fill  Ratio  Feature 

The third textural feature, the weighted-rank fill ratio,
measures the percentage of the total energy contained
in the brightest scatterers of a detected object.
Using the notation of Equations 2 and 3, we define
the weighted-rank fill ratio η as follows:

$$\eta = \frac{\sum_{k \text{ brightest pixels}} P(r, a)}{\sum_{\text{all pixels}} P(r, a)} ,$$

where k is selected to correspond approximately to
the brightest 5% of the detected object's pixels. For
man-made objects a significant portion of the total
energy comes from a small number of bright scatterers;
for natural clutter the total energy is distributed
more evenly among the pixels.
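
The weighted-rank fill ratio can be sketched as below. This is an illustrative sketch under stated assumptions: the function name, the flat list of pixel powers, and the rounding rule used to turn a fraction into a pixel count k are all hypothetical, and the `fraction` parameter (default 0.05) should be set to whatever share of pixels is treated as "brightest". The sample powers are made up.

```python
def weighted_rank_fill_ratio(powers, fraction=0.05):
    """Share of total power contained in the brightest pixels of an object.

    powers is a flat sequence of pixel powers P(r, a); fraction is the share
    of pixels counted as "brightest" (an adjustable assumption of this sketch).
    """
    if not powers:
        raise ValueError("object has no pixels")
    k = max(1, int(round(fraction * len(powers))))  # always use at least one pixel
    ranked = sorted(powers, reverse=True)
    total = sum(ranked)
    if total <= 0:
        raise ValueError("total power must be positive")
    return sum(ranked[:k]) / total


man_made = [100.0] + [1.0] * 19   # one dominant scatterer among 20 pixels
natural = [6.0] * 20              # power spread evenly over 20 pixels
print(round(weighted_rank_fill_ratio(man_made), 2))
print(round(weighted_rank_fill_ratio(natural), 2))
```

With these made-up values the single brightest pixel holds about 84% of the power in the first case and 5% in the second, which is the contrast between man-made objects and natural clutter that the feature is meant to capture.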

Future  ATR  System  Improvements 

The baseline ATR system currently uses only three
features in the discrimination stage (standard deviation,
fractal dimension, and weighted-rank fill ratio);
we have found that these features reliably reject natural-clutter
false alarms. Other discrimination features
could be added that would also reject some cultural-clutter
false alarms. For example, a size feature, such
as length and width of the detected object, could




FIGURE 12. Fractal-dimension calculation for the binary
image of a tree. The 50 brightest pixels (indicated by the
small black boxes) are relatively isolated, and 41 two-pixel-by-two-pixel
boxes are needed to cover them, which results
in a low fractal dimension of 0.29.

FIGURE 13. Fractal-dimension calculation for the binary
image of a house rooftop. The 50 brightest pixels (indicated
by the small black boxes) are more tightly clustered
than they are for the tree in Figure 12, and only 21 two-pixel-by-two-pixel
boxes are needed to cover them, which
results in a higher fractal dimension of 1.25.

Recognition System by Using 3-D Target Information”
by Shawn M. Verbout et al. presents an approach to
this classification task based on the generation of 2-D
templates from 3-D models of targets.

Because the pattern-matching approach to classification
requires a large number of templates, we are
investigating an alternative approach to classification
that uses spatial matched filters. The initial results of
these studies indicate the possibility of a significant
reduction in computation time and storage requirements
with no reduction in performance.

Acknowledgments 

The authors wish to acknowledge the significant contributions
made by Michael C. Burl, particularly to
the development of the fractal-dimension feature. The
authors also wish to thank Steven M. Auerbach for
his help in technical editing and Robin Fedorchuk for
her outstanding typing efforts. This work was sponsored
by the Advanced Research Projects Agency.






LESLIE M. NOVAK
is a senior staff member in the Surveillance Systems group. He
received a B.S.E.E. degree from Fairleigh Dickinson University
in 1961, an M.S.E.E. degree from the University of Southern
California in 1963, and a Ph.D. degree in electrical engineering
from the University of California, Los Angeles, in 1971. Since
1977 Les has been a member of the technical staff at Lincoln
Laboratory, where he has studied the detection, discrimination,
and classification of radar targets. He has contributed chapters
on stochastic observer theory to the series Advances in Control
Theory, edited by C.T. Leondes (Academic Press, New York),
volumes 9 and 12.

GREGORY J. OWIRKA
is an assistant staff member in the Surveillance Systems group.
He received a B.S. degree (cum laude) in applied mathematics
from Southeastern Massachusetts University, and he is currently
working on an M.S. degree in electrical engineering at
Northeastern University. Greg's current research interests are in
automatic target recognition. He has been at Lincoln Laboratory
since 1987.

CHRISTINE M. NETISHEN
is an assistant staff member in the Surveillance Systems group.
Her research speciality is in the detection, discrimination, and
classification of stationary ground vehicles in SAR imagery. She
received a B.S. degree in mathematics (cum laude) from
Providence College; she also spent a year at Cambridge
University in England studying mathematics and physics.
Christine has been at Lincoln Laboratory since 1991.


Discriminating  Targets 
from  Clutter 


Daniel  E.  Kreithen,  Shawn  D.  Halversen,  and  Gregory  J.  Owirka 

■  The  Lincoln  Laboratory  multistage  target-detection  algorithm  for  synthetic- 
aperture  radar  (SAR)  imagery  can  be  separated  into  three  stages:  the 
prescreener,  the  discriminator,  and  the  classifier.  In  this  article,  we  focus  on  the 
discrimination  algorithm,  which  is  a  one-class,  feature-based  quadratic 
discriminator.  An  important  element  of  the  algorithm  design  is  the  choice  of 
features.  We  examine  fifteen  features  that  are  used  in  the  discrimination 
algorithm — three  features  developed  by  Lincoln  Laboratory,  nine  developed  by 
the  Environmental  Research  Institute  of  Michigan,  two  developed  by  Rockwell 
International  Corporation,  and  one  developed  by  Loral  Defense  Systems.  The 
set  of  best  features  from  this  pool  of  fifteen  was  determined  by  a  theoretical 
analysis,  and  was  then  verified  by  using  real  SAR  data.  Performance  was 
evaluated  for  a  number  of  different  cases:  for  fully  polarimetric  data  and  HH 
polarization  data  and  for  1-ft  resolution  data  and  1-m  resolution  data.  In  all 
cases  the  theoretical  performance  analysis  closely  matched  the  real  data 
performance.  This  closeness  demonstrates  a  good  understanding  of  the 
discrimination  algorithm.  In  addition,  we  formulate  a  set  of  criteria  for  best 
feature  choice  that  apply  to  quadratic  discrimination  algorithms  in  general. 


Lincoln Laboratory has procured a fully
polarimetric, instrumentation-quality, high-resolution
(1 ft by 1 ft), 35-GHz, millimeter-wave
(MMW) synthetic-aperture radar (SAR), which
has been used to gather imagery of targets of interest
and clutter in a number of different locations and
deployments. The radar, which is mounted in a
Gulfstream G-1 aircraft, records data in flight onto
24-track magnetic tapes. The tapes are then processed
on the ground to form the SAR imagery. A recent
Lincoln Laboratory Journal article by Leslie M. Novak
et al. describes this radar system [1].

The  Surveillance  Systems  group  at  Lincoln  Labo¬ 
ratory  has  been  developing  algorithms  to  detect  tar¬ 
gets  of  interest  in  this  SAR  imagery.  A  block  diagram 
of  the  algorithm  suite  is  shown  in  Figure  1.  The 
target-detection  algorithm  suite  takes  the  form  of  a 
multistage  algorithm.  In  theory,  it  is  possible  to  con¬ 
struct  a  single  algorithm  that  performs  target  detec¬ 
tion  in  an  optimal  manner,  and  which  exploits  all  of 


the  information  present  in  a  high-resolution  SAR 
image.  Unfortunately,  it  is  often  difficult  to  design 
algorithms  using  the  single-algorithm  approach,  be¬ 
cause  high-resolution  SAR  imagery  is  difficult  to  model 
accurately  and  hence  is  poorly  understood.  The  mul¬ 
tistage  approach  becomes  an  attractive  alternative, 
because  of  the  reduction  in  required  computational 
capability  and  the  simplification  in  algorithm  design. 

The Lincoln Laboratory multistage algorithm has
three separate stages, each of which performs easily
identifiable functions. The first stage, which is called
the prescreener, is a computationally simple algorithm
whose function is to pass all targets and eliminate
only obviously non-targetlike naturally occurring clutter.
The second stage, called the discriminator, ideally
eliminates all naturally occurring clutter that has been
passed by the prescreener, and passes only man-made
objects to the third stage, which is called the classifier.
The classifier receives all man-made objects that have
been passed by the discriminator and categorizes each



[Block diagram: Input image -> Prescreener -> Discriminator -> Classifier -> Classifies targets.
The prescreener rejects imagery without potential targets, the discriminator rejects
natural-clutter false alarms, and the classifier rejects man-made clutter.]

FIGURE 1. Block diagram of the multistage target-detection algorithm. This article concentrates on the discriminator stage.


one either as a target of interest (of which there can be
a number of classes) or as an uninteresting man-made
object.

In this article, we concentrate our attention on the
second stage, the discriminator. The prescreener stage
is covered elsewhere [2]; the classification stage is still
under development.

Algorithm  Description 

The discrimination algorithm used in the Lincoln
Laboratory automatic target-detection algorithm suite
is centered around a one-class quadratic discriminator
[3–5]. A one-class discriminator is trained only on a
target-training set, and it assumes that the clutter
false-alarm dataset (i.e., the set of false alarms passed
by the prescreener stage) has unknown attributes in a
feature space. Figure 2 illustrates the concept of this
discrimination algorithm. For each region of interest,
the algorithm produces a score that measures the
distance from the candidate to the center of the target-training
set (in a feature space). When the algorithm
is properly trained, a lower value of this
distance metric indicates a more targetlike candidate.
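
To make the idea of a one-class distance score concrete, the sketch below uses a common textbook form of a quadratic distance, (x - m)^T C^-1 (x - m), where m and C are the mean and covariance of the target-training feature vectors. This specific form is an assumption made for illustration; the exact metric used in the algorithm is described later in the article. The function names, the training vectors, and the threshold value are all hypothetical.

```python
def mean_and_covariance(samples):
    """Sample mean vector and covariance matrix of a list of feature vectors."""
    n, dim = len(samples), len(samples[0])
    mean = [sum(s[j] for s in samples) / n for j in range(dim)]
    cov = [[sum((s[i] - mean[i]) * (s[j] - mean[j]) for s in samples) / (n - 1)
            for j in range(dim)] for i in range(dim)]
    return mean, cov


def solve(matrix, rhs):
    """Solve matrix * y = rhs by Gaussian elimination with partial pivoting."""
    size = len(rhs)
    aug = [list(row) + [b] for row, b in zip(matrix, rhs)]
    for col in range(size):
        pivot = max(range(col, size), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("covariance matrix is singular")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, size):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, size + 1):
                aug[r][c] -= factor * aug[col][c]
    y = [0.0] * size
    for r in reversed(range(size)):
        tail = sum(aug[r][c] * y[c] for c in range(r + 1, size))
        y[r] = (aug[r][size] - tail) / aug[r][r]
    return y


def quadratic_distance(x, mean, cov):
    """Quadratic distance (x - mean)^T cov^-1 (x - mean)."""
    diff = [xi - mi for xi, mi in zip(x, mean)]
    return sum(d * yi for d, yi in zip(diff, solve(cov, diff)))


# Made-up two-feature training vectors for targets only (one-class training).
targets = [[1.0, 2.0], [1.2, 1.8], [0.8, 2.2], [1.1, 2.1], [0.9, 1.9]]
mean, cov = mean_and_covariance(targets)
THRESHOLD = 9.0  # hypothetical value; in practice it would be set from training data
for candidate in ([1.0, 2.05], [3.0, 0.5]):
    d = quadratic_distance(candidate, mean, cov)
    print(candidate, "target" if d < THRESHOLD else "clutter")
```

Only target data enter the training step, which mirrors the one-class idea: clutter is never modeled, and a candidate is rejected simply because it lies too far from the target cluster in feature space.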

Key elements of the discrimination algorithm are
the features used to compute the distance metric. We
cover the discrimination features used in the Lincoln
Laboratory target-detection algorithm suite in the next
five sections of this article. Subsequent sections cover
the discrimination algorithm itself in great detail.

Discrimination Features

A number of attributes that are present in the fully
polarimetric, high-resolution SAR imagery can be exploited
to discriminate between targets and clutter
false alarms. These attributes include size, shape, power,
polarimetric properties, spatial distribution of reflected
power, and dimensionality. Unfortunately, at present
no method exists for developing discrimination features
to exploit these attributes in any optimal fashion.
The best that can be done is to design a feature
that seems to exploit a specific attribute, and then test
the feature on a variety of data to see if it separates
targets from natural-clutter false alarms. If it does not
separate targets from false alarms, the feature design is
obviously poor; if it does separate them, then the
feature may be a good one.

The other major criterion that a feature must satisfy
is orthogonality. In simple terms, features used


FIGURE  2.  Conceptual  diagram  of  a  one-class  discrimina¬ 
tion  algorithm.  This  diagram  represents  a  two-dimensional 
feature  space.  If  the  separation  in  feature  space  is  less 
than  the  threshold,  then  the  region  of  interest  from  which 
the  input  vector  is  extracted  is  declared  to  be  a  target. 
Conversely,  if  the  separation  in  feature  space  is  greater 
than  the  threshold,  the  region  of  interest  is  declared  to  be 
clutter. 




• KREITHEN ET AL.

Discriminating Targets from Clutter


together  in  a  discrimination  algorithm  must  measure 
different attributes of the region of interest. For example,
five different features that measure similar
polarimetric properties of the candidate region of
interest should not be used together in the same
discrimination algorithm. In fact, using similar features
is likely to make discrimination performance
worse, because the target-training dataset is likely to
differ somewhat from the target-testing dataset. This
phenomenon is covered more fully in the sidebar entitled
“Adding Features Can Degrade Performance.”

Another  desirable  property  of  the  features  in  a 
discrimination  algorithm  is  that  they  should  be  ro¬ 
bust  in  a  number  of  ways.  Many  feature  algorithms 
require  thresholds  to  be  set  to  isolate  the  brightest 
scatterers,  for  example,  or  to  isolate  the  scatterers  with 
the  most  contrast.  The  feature  values  should  not  be 
too  sensitive  to  the  settings  of  these  thresholds,  be¬ 
cause  they  may  then  work  in  one  deployment  situa¬ 
tion  but  not  in  another  similar  situation.  The  features 
should  also  be  somewhat  robust  to  countermeasures; 
radar  signatures  of  military  vehicles  are  frequently 
altered  by  any  number  of  methods.  Some  common 
methods  include  placing  foliage  and  mud  on  the 
vehicle,  adding  metal  parts  to  the  vehicle,  deploying 
camouflage  netting  around  and  on  top  of  the  vehicle, 
coating  the  target  with  radar-absorbing  material,  or 
simply  opening  the  hatches  of  an  armored  target. 
An  effective  discrimination  feature  would  ideally  be 
insensitive  to  these  methods  and  to  other  types  of 
countermeasures. 

We examined fifteen features for use in the Lincoln
Laboratory target discrimination algorithm; three of
these features were developed at Lincoln Laboratory
[6], nine were developed at the Environmental Research
Institute of Michigan (ERIM) of Ann Arbor,
Michigan, two were developed by Rockwell International
Corporation of El Segundo, California, and
one was developed by Loral Defense Systems of
Goodyear, Arizona. The non-Lincoln Laboratory features
were developed under the Strategic Target Algorithm
Research (STAR) contract, a yearlong research
contract funded jointly by the Advanced Research
Projects Agency (ARPA) and the United States Air
Force. This contract was administered by Lincoln
Laboratory; the goal was to develop and test target
detection algorithms by using a common dataset provided
by Lincoln Laboratory. Each contractor's approach
to the target-detection problem was somewhat
different, but all used a number of features. We
chose the most promising features for evaluation in
this study.

Lincoln  Laboratory  Discrimination  Features 

The  three  Lincoln  Laboratory  discrimination  features 
are  standard  deviation,  fractal  dimension,  and 
weighted-rank  fill  ratio.  They  were  developed  by  Leslie 
Novak,  Michael  Burl,  and  Gregory  Owirka,  all  of 
Lincoln  Laboratory.  The  features  are  computed  from 
target-sized  areas  that  are  centered  over  the  pixels 
identified  by  the  prescreener  for  further  processing. 
The  extent  of  the  target-sized  area  is  determined  by 
the  a  priori  knowledge  of  what  type  of  target  is  being 
sought,  and  a  box-spinning  algorithm  is  used  to  de¬ 
termine  target  orientation  [7]. 

The  standard-deviation  feature  is  computed  from 
the  typical  estimator  for  the  standard  deviation.  It 
uses  the  power  (expressed  in  dB)  of  all  the  pixels  in  a 
target-sized  box. 
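
A short sketch of this feature follows. It assumes the box is supplied as strictly positive linear power values and that the "typical estimator" is the unbiased sample standard deviation; both are assumptions made for illustration.

```python
import numpy as np

def std_dev_feature(power_box):
    """Standard deviation of pixel power in dB over a target-sized box."""
    power = np.asarray(power_box, dtype=float)
    power_db = 10.0 * np.log10(power)          # assumes strictly positive linear power
    return float(np.std(power_db, ddof=1))     # ddof=1: unbiased sample estimator (assumed)

box = np.array([[1.0, 10.0], [100.0, 1000.0]])  # toy box: 0, 10, 20, 30 dB
feature = std_dev_feature(box)
```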

The fractal-dimension feature, which is illustrated
in Figure 3, provides a measure of the spatial dimensionality
of the potential target [6]. This feature estimates
the Hausdorff dimension of the spatial distribution
of the top N scatterers in the region of interest.
For  example,  a  straight  line  has  a  Hausdorff  dimen¬ 
sion  of  one,  and  a  solid  rectangle  has  a  Hausdorff 
dimension  of  two.  Various  other  space-filling  objects 
with  holes  have  a  Hausdorff  dimension  that  falls  be¬ 
tween  one  and  two.  An  isolated  point  has  a  Hausdorff 
dimension  of  zero. 

To compute the fractal-dimension feature, we
threshold the region of interest by taking only the top
N scatterers in terms of power. A binary image is
created from these scatterers, and the minimum number
$n_1$ of 1-pixel-by-1-pixel boxes ($d_1 = 1$) that cover
all N scatterers is determined. This number, of course,
is equal to the value N. Then the minimum number
$n_2$ of 2-pixel-by-2-pixel boxes ($d_2 = 2$) that cover all N
scatterers is determined. This number is less than or
equal to N. If the spatial distribution of the scatterers
is highly diffuse, the value of $n_2$ will be close to N; if




ADDING  FEATURES  CAN  DEGRADE 
PERFORMANCE 


In the main article, a theoretical analysis of the
one-class quadratic discriminator yields expressions
for the probability of detection ($P_d$) and the
probability of false alarm ($P_{fa}$) of the algorithm.
We make the claim in the section on discrimination
features that adding features does not necessarily
improve discrimination performance. We show here
that this is indeed the case by giving two examples;
the first example shows discrimination performance
with the set of five best features, and the second
example shows discrimination performance with the
same set of features in addition to seven other features.

The idea that adding features can degrade performance
is, perhaps, counterintuitive. In fact, we cannot
degrade performance by adding features if a few key
conditions are met. These conditions are (1) the real
data obey perfectly the multivariate Gaussian assumption
made in the section entitled “Theoretical Analysis
of the One-Class Quadratic Discrimination Algorithm,”
and (2) the target-training data and the target-testing
data have exactly the same statistical distribution.

The multivariate Gaussian condition is difficult to
verify for the real data, especially when a number of
features are being considered (see the section entitled
“Confirming the Gaussian Assumption”). Small departures
from Gaussianity, however, are probably not the major
cause of the phenomenon we are addressing here. Instead,
the major cause of degraded performance is that the
second condition is not being met.

Figure A shows performance curves for the 1-ft,
polarimetric whitening filter (PWF) data using the
features described in the section entitled “Best Features
for Discrimination.” This figure, which is the same
diagram as Figure 17, also shows performance curves for
the same dataset using all twelve Lincoln Laboratory and
ERIM STAR discrimination features. Notice the significant
performance degradation that occurs when the seven extra
features are added. Notes on interpreting this type of
graph can be found in the sidebar entitled “Interpreting
Plots of $P_d$ versus FA/km².”

Figure  B  is  a  notional  diagram 
that  illustrates  the  degradation 
phenomenon.  The  diagram  is 
complicated  but  the  explanation 
of  it  is  relatively  easy.  There  are 
three  distinct  sets  of  data  displayed 
in  the  diagram:  target  training, 
target  testing,  and  clutter  false 
alarm.  Each  dataset  is  displayed 
for  two  features,  which  we  call 


FIGURE A. Performance curves comparing discrimination performance for
the five best features and for all twelve features. Performance degrades
when more features are added. (Abscissa: FA/km².)




Feature 1 and Feature 2. Imagine
that the target-training data and
the target-testing data are the same
(training and testing is done on
the same data, so ignore the red
points on the diagram for the time
being).

First, assume that the discrimination algorithm uses
only Feature 1 (refer to only the green and black points
plotted along the abscissa for this case). We see that
the discriminator does a good job of separating targets
from clutter by using Threshold A (three false alarms are
called targets). In this case the two key conditions are
satisfied: the target-training data and target-testing
data have exactly the same statistical distribution, and
the data obey (more or less) the Gaussian assumption
(within each data type).

Now,  while  still  assuming  the 
target-training  data  and  target¬ 
testing  data  are  the  same,  we  add 
Feature  2  (refer  to  only  the  green 
and  black  points  in  the  middle  of 
the  graph).  We  can  then  draw 
Threshold  B  as  an  ellipse  around 
the  data,  and  do  a  still  better  job 
at  separating  targets  from  clutter 
(in  this  case  one  false  alarm  is 
called  a  target).  The  two  key  cri¬ 
teria  are  still  obeyed. 

If  we  assume  now  that  the  tar¬ 
get-training  data  and  the  target¬ 
testing  data  are  different  datasets, 


FIGURE B. Notional diagram of reason for performance degradation when
features are added. This graph shows two features in a feature space.
(Legend: target testing, target training, clutter false alarms. Abscissa:
Feature 1; ordinate: Feature 2. Threshold A and Threshold B are marked.)


we  remove  the  second  key  condi¬ 
tion  mentioned  above.  First,  we 
assume  that  the  discrimination  al¬ 
gorithm  uses  only  Feature  1  (re¬ 
fer  to  the  green,  red,  and  black 
points  plotted  along  the  abscis¬ 
sa).  By  drawing  a  threshold  at  the 
point  labeled  Threshold  A,  we  see 
that  the  discrimination  perfor¬ 
mance  is  equivalent  to  the  previ¬ 
ous  case  because  the  target-train¬ 
ing  data  and  the  target-testing 
data  for  Feature  1  have  similar 
characteristics. 

Now,  we  add  Feature  2.  Per¬ 
formance  is  severely  degraded  be¬ 
cause  we  cannot  draw  an  ellipse 
(with  the  same  orientation  as 
Threshold  B)  around  the  center 
of  mass  of  the  green  points  that 
does  not  engulf  large  numbers  of 
clutter  false  alarms  (black  points) 
while  still  engulfing  the  target¬ 
testing  data  (red  points). 

The  important  point  is  that 
any  threshold  ellipse  must  have 
the  same  orientation  as  the  ellipse 
shown  as  Threshold  B,  because 
the  orientation  of  the  ellipse  is 
determined  by  the  statistical  char¬ 
acteristics  of  the  target-training 
data.  This  example  is  a  particu¬ 
larly  egregious  illustration  of 
the  failure  of  the  second  key 
condition,  because  the  target¬ 
training  data  and  the  target-test¬ 
ing  data  now  have  different 
statistical  characteristics.  The  ad¬ 
dition  of  Feature  2  obviously  de¬ 
grades  performance  severely.  More 
subtle  cases  that  significantly 
affect  performance  occur  more 
frequently. 




FIGURE 3. Calculation of the fractal-dimension feature, which measures the spatial bunching of the brightest pixels in
a region of interest. (a) The brightest pixels for a tree tend to be widely separated, which requires a relatively large
number of covering boxes and produces a low value for the fractal dimension. (b) The brightest pixels for a target tend
to be closely bunched, which requires fewer covering boxes and produces a high value for the fractal dimension.


the scatterers are spatially bunched, the value of $n_2$ will
be considerably less than N. These values are determined
for the two specific examples in Figure 3 and
plotted in Figure 4, with the logarithm of $n_1$ and $n_2$
(the numbers of covering boxes of side $d_1 = 1$ and $d_2 = 2$ pixels)
on the ordinate, and the logarithm of $d_1$ and $d_2$ on the
abscissa. The negative slope of the line through the
two points, which is given by

$$H = \frac{\log n_1 - \log n_2}{\log d_2 - \log d_1}, \qquad (1)$$

is an estimate of the Hausdorff dimension of the
region of interest. For the high-resolution data in this
article, we used N = 50.
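
The calculation can be sketched as follows. The function names are hypothetical, and counting occupied cells of a fixed 2-by-2 grid is a simplification that gives an upper bound on the minimum number of covering boxes rather than the true minimum. The last two lines apply Equation 1 to the box counts quoted in the Figure 4 caption.

```python
import numpy as np

def hausdorff_estimate(n1, n2, d1=1.0, d2=2.0):
    """Negative slope of log(n) versus log(d) through the two box-count points."""
    return (np.log(n1) - np.log(n2)) / (np.log(d2) - np.log(d1))

def fractal_dimension(power, n_top=50):
    """Fractal-dimension feature from the n_top brightest pixels of a 2-D power image.
    Occupied cells of a fixed 2x2 grid stand in for the minimum box cover."""
    power = np.asarray(power, dtype=float)
    flat_idx = np.argsort(power, axis=None)[-n_top:]        # indices of brightest pixels
    rows, cols = np.unravel_index(flat_idx, power.shape)
    n1 = len(flat_idx)                                      # 1x1 boxes: one per scatterer
    n2 = len(set(zip((rows // 2).tolist(), (cols // 2).tolist())))   # occupied 2x2 cells
    return float(hausdorff_estimate(n1, n2))

# Box counts quoted in the Figure 4 caption (N = 50 scatterers)
h_tree = hausdorff_estimate(50, 41)     # about 0.29
h_target = hausdorff_estimate(50, 20)   # about 1.32
```

With $d_1 = 1$ and $d_2 = 2$ the estimate reduces to $\log_2(N / n_2)$, so it ranges from zero (every scatterer in its own box) to two (every box completely filled).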

The weighted-rank fill-ratio feature is computed
from the top N scatterers in the target-sized box. The
feature is computed by totaling the power in the top
N pixels within the target-sized box, and normalizing
by the total power of all pixels in the box. This feature
attempts to exploit the fact that power returns from


FIGURE 4. An estimate of the Hausdorff dimension of the
tree and target in Figure 3. For both objects the total number
of scatterers is 50; for the tree the minimum number of
covering boxes is 41 and for the target the minimum number
of covering boxes is 20. The negative slope of the line
for each object is the estimate of the fractal-dimension
feature. Targets tend to have higher fractal dimensions
than natural clutter. (Abscissa: $\log_{10} d$.)




most  targets  tend  to  be  concentrated  in  a  few  bright 
scatterers,  whereas  power  returns  from  natural-clutter 
false  alarms  tend  to  be  more  diffuse.  This  feature 
measures  a  power-related  property  of  the  target-sized 
box,  which  makes  this  feature  different  from  the  fractal- 
dimension  feature,  which  measures  a  spatial  property 
of  the  entire  region  of  interest. 
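
A minimal sketch of the weighted-rank fill ratio, assuming the box is given as linear power values; the function name and the zero-power convention are illustrative choices.

```python
import numpy as np

def weighted_rank_fill_ratio(power_box, n_top=50):
    """Fraction of the total box power contained in the n_top brightest pixels."""
    power = np.sort(np.asarray(power_box, dtype=float), axis=None)  # ascending, flattened
    total = power.sum()
    if total == 0.0:
        return 0.0                     # assumed convention for an empty box
    return float(power[-n_top:].sum() / total)

box = np.ones((10, 10))
box[0, 0] = 101.0                      # one bright scatterer among 99 unit pixels
ratio = weighted_rank_fill_ratio(box, n_top=1)
```

A ratio near one indicates that a few scatterers dominate the return, as expected for a target; diffuse clutter gives a smaller ratio.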

ERIM  Discrimination  Features 

The ERIM discrimination features were developed
and  provided  to  Lincoln  Laboratory  under  the  STAR 
contract  mentioned  above.  They  were  modified  for 
this  study  by  altering  the  thresholds  to  account  for  a 
target  dataset  that  was  substantially  different  from  the 
dataset  used  for  the  STAR  contract.  Instead  of  using  a 
target-sized  box  as  a  preliminary  step,  as  in  the  Lin¬ 
coln  Laboratory  feature  algorithms,  the  ERIM  feature 
algorithms  compute  a  target-shaped  blob  by  perform¬ 
ing  morphological  operations.  These  operations  serve 
both  as  a  method  of  grouping  spatially  related  hits 
from  the  prescreener  and  as  a  method  of  estimating 
the  size,  shape,  and  orientation  of  the  supposed 
target. 

There  are  three  categories  of  ERIM  discrimination 
features:  size-related  features,  contrast-based  features, 
and  polarimetric  features.  Each  of  these  three  catego¬ 
ries  contains  three  features.  The  size-related  features 
are  mass,  diameter,  and  square-normalized  rotational 
inertia.  The  contrast-based  features  are  maximum  con¬ 
stant  false-alarm  rate  (CFAR)  statistic,  mean  CFAR 
statistic,  and  percent  bright  CFAR  statistic.  The  pola¬ 
rimetric  features  are  percent  pure,  percent  pure  even, 
and  percent  bright  even.  We  describe  each  feature  in 
detail  in  the  following  paragraphs. 

The  three  size-related  features  utilize  only  the  bi¬ 
nary  image  created  by  the  morphological  operations. 
The  mass  feature  is  computed  by  counting  the  num¬ 
ber  of  pixels  in  the  morphological  blob.  The  diameter 
is  the  length  of  the  diagonal  of  the  smallest  rectangle 
(either  horizontally  oriented  or  vertically  oriented) 
that  encloses  the  blob.  The  square-normalized  rota¬ 
tional  inertia  is  the  second  mechanical  moment  of  the 
blob  around  its  center  of  mass,  normalized  by  the 
inertia  of  an  equal  mass  square. 
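
The three size-related features can be sketched from a binary blob mask as shown below. The morphological operations that produce the blob are not reproduced here, and the discrete reference inertia n(n - 1)/6 for a solid square of n unit-mass pixels is an assumed reading of "equal mass square".

```python
import numpy as np

def size_features(blob):
    """Mass, diameter, and square-normalized rotational inertia of a binary blob."""
    rows, cols = np.nonzero(np.asarray(blob, dtype=bool))
    mass = rows.size                                   # number of blob pixels
    if mass == 0:
        return 0, 0.0, 0.0                             # assumed convention for an empty blob
    height = rows.max() - rows.min() + 1               # axis-aligned bounding rectangle
    width = cols.max() - cols.min() + 1
    diameter = float(np.hypot(height, width))          # length of its diagonal, in pixels
    # Second moment of the pixel positions about the center of mass
    inertia = np.sum((rows - rows.mean()) ** 2 + (cols - cols.mean()) ** 2)
    # A solid s-by-s square of unit-mass pixels has inertia n(n - 1)/6 with n = s*s
    square_inertia = mass * (mass - 1) / 6.0
    norm_inertia = float(inertia / square_inertia) if mass > 1 else 0.0
    return mass, diameter, norm_inertia

blob = np.zeros((8, 8), dtype=bool)
blob[2:6, 2:6] = True                                  # a solid 4x4 square
mass, diameter, norm_inertia = size_features(blob)
```

For the solid square the normalized inertia is one by construction; elongated or ragged blobs of the same mass give larger values.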

The contrast-based features are determined by a
CFAR algorithm. This algorithm can be described by

$$\frac{x - \hat{\mu}}{\hat{\sigma}}, \qquad (2)$$

where x represents the test pixel, and $\hat{\mu}$ and $\hat{\sigma}$ are
estimates of the local mean and local standard deviation,
respectively, of the surrounding clutter. The estimates
of the parameters from the surrounding clutter
are computed by using the pixels in a window
around the supposed target whose opening is large
enough to exclude the target return. Figure 5 illustrates
this window; the opening is called the guard
area.

The  CFAR  statistic  given  by  Equation  2  is  com¬ 
puted  for  each  pixel  to  create  a  CFAR  image.  The 
maximum  CFAR  feature  is  the  maximum  value  in  the 
CFAR  image  contained  within  the  target-shaped  blob. 
This  quantity  is  similar  to  the  basic  feature  used  in 
the  prescreener  algorithm.  The  mean  CFAR  feature  is 
the  average  of  the  CFAR  image  taken  over  the  target- 
shaped  blob.  The  percent  bright  CFAR  feature  is  the 
percentage  of  pixels  within  the  target-shaped  blob 
that  exceed  a  certain  CFAR  value. 
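
The CFAR statistic and the three contrast-based features can be sketched as follows. The square ring geometry, the default `guard` and `window` sizes, the population standard deviation, and the `bright_level` threshold are all assumptions for illustration; the sketch also assumes `window > guard`, nonconstant clutter, and blob pixels at least `window` pixels from the image edge.

```python
import numpy as np

def cfar_statistic(img, r, c, guard=2, window=4):
    """(x - mean) / std, with clutter statistics taken from a square ring around (r, c).
    The ring is the (2*window+1)-pixel square minus the (2*guard+1)-pixel guard square."""
    patch = img[r - window:r + window + 1, c - window:c + window + 1].astype(float)
    ring = np.ones(patch.shape, dtype=bool)
    inner = window - guard
    ring[inner:-inner, inner:-inner] = False          # exclude guard area and test pixel
    clutter = patch[ring]
    return (img[r, c] - clutter.mean()) / clutter.std()   # population std (assumed)

def contrast_features(img, blob, bright_level, guard=2, window=4):
    """Maximum, mean, and percent-bright CFAR features over the pixels of a blob."""
    values = np.array([cfar_statistic(img, r, c, guard, window)
                       for r, c in zip(*np.nonzero(blob))])
    return values.max(), values.mean(), float(np.mean(values > bright_level))

rows, cols = np.indices((11, 11))
img = ((rows + cols) % 2) * 2.0        # checkerboard clutter: ring mean 1, std 1
img[5, 5] = 6.0                        # bright test pixel
stat = cfar_statistic(img, 5, 5)
blob = np.zeros((11, 11), dtype=bool)
blob[5, 5] = True
max_cfar, mean_cfar, pct_bright = contrast_features(img, blob, bright_level=3.0)
```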

The polarimetric discrimination features are based
on a transformation of the linear polarization basis in
which the Lincoln Laboratory MMW SAR gathers
data to an even-bounce, odd-bounce basis described
by the equations

$$E_{\mathrm{odd}} = \frac{|HH + VV|^2}{2} = 2|RL|^2$$

and

$$E_{\mathrm{even}} = \frac{|HH - VV|^2}{2} + 2|HV|^2 = |RR|^2 + |LL|^2 .$$


FIGURE 5. CFAR template, showing the pixel under test,
and the surrounding window of pixels from which clutter
estimates are computed. The test pixel and clutter window
are separated by a guard area, which protects the clutter
estimates from being corrupted by portions of the target
return. (Labels in the diagram: guard area, test cell, target.)

The odd-bounce channel given by the first equation
corresponds to the radar return from a flat plate or a
trihedral; the even-bounce channel corresponds to
the radar return from a dihedral. Figure 6 illustrates
examples of these reflectors, along with notional diagrams
of how they reflect the radar energy. The usefulness
of these polarimetric features resides in the fact
that few dihedral structures exist in natural clutter,
but these structures are plentiful on most man-made
targets. Natural clutter tends to exhibit more odd-bounce
reflected energy than even-bounce reflected
energy.
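
The odd-bounce and even-bounce equations can be checked numerically with a short sketch. The function name and the idealized unit-amplitude scatterers are illustrative assumptions.

```python
import numpy as np

def odd_even_bounce(hh, hv, vv):
    """Odd-bounce and even-bounce power from complex HH, HV, and VV pixel values."""
    hh, hv, vv = (np.asarray(a, dtype=complex) for a in (hh, hv, vv))
    e_odd = np.abs(hh + vv) ** 2 / 2.0
    e_even = np.abs(hh - vv) ** 2 / 2.0 + 2.0 * np.abs(hv) ** 2
    return e_odd, e_even

# Idealized scatterers (assumed unit-amplitude responses)
trihedral = odd_even_bounce(1.0, 0.0, 1.0)     # HH = VV: all power is odd bounce
dihedral = odd_even_bounce(1.0, 0.0, -1.0)     # HH = -VV: all power is even bounce
```

The two channels together account for the full polarimetric power, since their sum equals $|HH|^2 + 2|HV|^2 + |VV|^2$.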

The ERIM polarimetric features are formed from
the even-bounce and the odd-bounce images. The
percent-pure feature is the fraction of pixels within
the target-shaped blob for which at least a certain
fraction of the scattered energy falls in either the
even-bounce channel or the odd-bounce channel. Percent
even is the fraction of pixels within the target-shaped
blob for which at least a certain fraction of the
scattered energy falls in the even-bounce channel.
The percent-bright-even feature is the fraction of
pixels that exceed a certain value in the CFAR image
described above, and which are mainly even-bounce
scatterers.
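
A sketch of these three features is given below, taking even-bounce and odd-bounce power images, a blob mask, and a CFAR image as inputs. The text does not give the "certain fraction" or the "certain value", so `purity` and `bright_level` are placeholder thresholds, and "mainly even-bounce" is read here as meeting the same purity fraction; a nonempty blob is assumed.

```python
import numpy as np

def polarimetric_features(e_odd, e_even, blob, cfar_img, purity=0.9, bright_level=3.0):
    """Percent-pure, percent-even, and percent-bright-even features over a blob.
    The purity fraction and CFAR level are placeholder thresholds."""
    blob = np.asarray(blob, dtype=bool)
    odd = np.asarray(e_odd, dtype=float)[blob]
    even = np.asarray(e_even, dtype=float)[blob]
    total = odd + even
    safe = np.where(total > 0, total, 1.0)            # avoid division by zero
    even_frac = even / safe
    odd_frac = odd / safe
    pure_even = even_frac >= purity                    # mainly even-bounce pixels
    pure = pure_even | (odd_frac >= purity)            # pure in either channel
    bright = np.asarray(cfar_img, dtype=float)[blob] > bright_level
    return (float(pure.mean()), float(pure_even.mean()),
            float((bright & pure_even).mean()))

e_odd = np.array([[1.0, 0.0, 5.0, 10.0]])
e_even = np.array([[0.0, 1.0, 5.0, 0.0]])
cfar = np.array([[0.0, 5.0, 5.0, 5.0]])
blob = np.ones((1, 4), dtype=bool)
pct_pure, pct_even, pct_bright_even = polarimetric_features(e_odd, e_even, blob, cfar)
```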

The  main  impetus  for  these  features  is  that  a  man¬ 
made  object  exhibits  approximately  equal  amounts  of 
pure  even-bounce  energy  and  odd-bounce  energy, 
whereas a natural-clutter false alarm is more likely to


FIGURE 6. Reflection of radar signal from a variety of reflectors.
(a) Odd-bounce reflectors include a flat plate and
a trihedral. (b) Even-bounce reflectors include a dihedral.
Radar backscatter from natural clutter is predominantly
odd bounce, while backscatter from man-made objects is
typically an equal mixture of even bounce and odd bounce.

exhibit  large  amounts  of  pure  odd-bounce  energy. 
Also, man-made objects are more likely to exhibit an
equal  mixture  of  even-bounce  and  odd-bounce 
energy  than  a  natural-clutter  false  alarm. 

Rockwell Discrimination Features

The  Rockwell  discrimination  features  were  also  devel¬ 
oped  and  provided  to  Lincoln  Laboratory  under  the 
STAR contract. Like the ERIM discrimination features,
they were modified to account for the different
type  of  target  data  used  in  the  present  study.  These 




features use the pixels in a target-sized box for feature
calculation; the algorithm used to determine the orientation
of the target-sized box is the same as that
used for computing the orientation for the Lincoln
Laboratory discrimination features described earlier.
Some of the Rockwell discrimination features are similar
to those already used by Lincoln Laboratory and
ERIM. These similar features were not considered.
Instead we concentrated on two other Rockwell features:
(1) the polarimetric phase ratio feature, and (2)
the specific-entropy feature.

The polarimetric phase ratio feature is another
attempt to exploit differences in polarization between
radar returns from targets and radar returns from
clutter. The relative phase between the HH polarization
channel and the VV polarization channel is used
for  this  purpose.  Only  pixels  in  a  target-sized  box  (the 
orientation  of  which  is  determined  by  the  box-spin¬ 
ning  algorithm  described  in  Reference  7)  are  exam¬ 
ined.  In  addition,  to  eliminate  the  low-return  pixels 
that  may  have  a  random  phase  due  to  corruption  by 
receiver  noise,  we  use  only  the  pixels  that  exceed  a 
threshold  in  both  the  HH  polarization  channel  and 
the  VV  polarization  channel.  This  threshold,  which  is 
set  to  a  percentage  of  the  maximum  power  in  a  pixel 
in  a  given  image,  is  set  to  a  low  value  so  the 
thresholding  operation  eliminates  only  the  lowest  re¬ 
turn  pixels  whose  phase  is  most  likely  to  be  corrupted 
by  receiver  noise. 


FIGURE 7. Histogram of relative phase between the HH
polarization channel and the VV polarization channel. This
type of plot is used in the calculation of the Rockwell
polarimetric phase ratio feature. (Abscissa: relative phase in degrees.)


The relative phase between the HH polarization
channel and the VV polarization channel is then calculated
for the remaining pixels within the target-sized
box. The values are arranged in a histogram
plot, such as the example shown in Figure 7, and the
feature is calculated from this plot. The polarimetric
phase ratio is defined as the number of pixels that fall
within ±x° of 180° relative phase on the histogram,
divided by the number of pixels that fall within ±x° of
0° relative phase on the histogram. We used a value of
x = 90°.
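
A sketch of the phase-ratio calculation follows. The function name and the `power_fraction` default are assumptions; the peak power is taken over the box rather than the whole image for simplicity; and because the comparison is inclusive, a pixel exactly x° from both centers would be counted in both bins.

```python
import numpy as np

def phase_ratio(hh, vv, x_deg=90.0, power_fraction=0.01):
    """Polarimetric phase ratio over the pixels of a target-sized box."""
    hh = np.asarray(hh, dtype=complex).ravel()
    vv = np.asarray(vv, dtype=complex).ravel()
    p_hh, p_vv = np.abs(hh) ** 2, np.abs(vv) ** 2
    # Keep pixels above a small fraction of the peak power in both channels.
    # Taking the peak over the box (not the whole image) is a simplification.
    keep = (p_hh > power_fraction * p_hh.max()) & (p_vv > power_fraction * p_vv.max())
    phase = np.degrees(np.angle(hh[keep] * np.conj(vv[keep])))   # HH-VV relative phase

    def within(center):
        # Angular distance to `center`, wrapped into [0, 180] degrees
        return np.abs((phase - center + 180.0) % 360.0 - 180.0) <= x_deg

    near_180, near_0 = np.count_nonzero(within(180.0)), np.count_nonzero(within(0.0))
    return near_180 / near_0 if near_0 else float("inf")

hh = np.array([1.0, 1.0, 1.0, 1.0])
vv = np.array([1.0, 1.0, 1.0, -1.0])     # last pixel: 180-degree relative phase
ratio = phase_ratio(hh, vv, x_deg=45.0)
```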

Specific entropy is the second Rockwell discrimination
feature used in this study. Because of the complicated
definition of this feature, it was not clear
which step in the calculation provides the ability to
separate targets from natural-clutter false alarms. To
understand this feature better, we investigated it in
considerable detail. A number of steps are involved in
computing this feature:

1. Choose a threshold T that is set to the quantity
corresponding to the 98th percentile of the surrounding
clutter, and calculate a normalized amplitude by

$$a_i = \max(p_i - T, 0),$$

where $a_i$ is the amplitude (in dB) above the threshold,
$p_i$ is the amplitude of the original pixel (in dB), i is the
pixel tag number (of which there are m, which is the
number of pixels in the target-sized box), and T is the
value (in dB) of the threshold.

2. Normalize the amplitude by

$$f_i = \frac{a_i}{\sum_{j=1}^{m} a_j},$$

unless $a_i = 0$ for all $i = 1, \ldots, m$, in which case the
specific-entropy feature is set to zero.

3. Compute the specific-entropy feature by

$$\text{specific entropy} = -\sum_{i=1}^{m} \frac{f_i \log f_i}{\log m}.$$
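
The three steps above can be sketched as follows. The function name, the use of `np.percentile` for the clutter threshold, the convention 0 log 0 = 0, and the guard for boxes with fewer than two pixels are assumptions made for illustration.

```python
import numpy as np

def specific_entropy(pixels_db, clutter_db):
    """Specific-entropy feature of a target-sized box, following the three steps above.
    pixels_db: pixel amplitudes (dB) in the box; clutter_db: surrounding clutter (dB)."""
    p = np.asarray(pixels_db, dtype=float).ravel()
    m = p.size
    threshold = np.percentile(clutter_db, 98)      # step 1: 98th percentile of clutter
    a = np.maximum(p - threshold, 0.0)             # amplitude above threshold
    if m < 2 or a.sum() == 0.0:
        return 0.0                                 # no pixel exceeds the threshold
    f = a / a.sum()                                # step 2: normalize to unit sum
    f = f[f > 0]                                   # convention: 0 * log(0) = 0
    return float(-np.sum(f * np.log(f)) / np.log(m))   # step 3: entropy / log(m)

clutter = np.zeros(100)                            # toy clutter: threshold T = 0 dB
uniform = specific_entropy([1.0, 1.0, 1.0, 1.0], clutter)   # equal excess amplitudes
single = specific_entropy([3.0, 0.0, 0.0, 0.0], clutter)    # one pixel above threshold
```

The feature reaches its maximum of one when every pixel exceeds the threshold by the same amount, and falls toward zero as the excess amplitude concentrates in fewer pixels.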


The idea behind this feature is to exploit two supposed
properties of a target: (1) the pixels exceeding
the threshold T do not vary greatly in amplitude for a
target, but they do vary greatly for a natural-clutter
false alarm; and (2) more pixels exceed the threshold
T for a target than for a natural-clutter false alarm.

The question that remains is which step in the feature
calculation provides the separation between targets
and natural-clutter false alarms.

We  studied  this  problem  by  separating  the  calcula¬ 
tion  of  the  specific  entropy  into  the  steps  described 
above,  and  then  we  calculated  a  feature  based  only  on 
the  operation  in  each  separate  step.  To  this  end,  we 
invented  a  simple  count  feature,  which  counts  the 
number of pixels that exceeded the threshold T as it
was  calculated  above,  and  normalizes  this  value  by  the 
total  possible  number  of  pixels  in  a  target-sized  box. 

This procedure was done for targets and for clutter
false  alarms,  and  the  count  feature  was  then  plotted  as 
a  scatterplot  versus  the  specific-entropy  feature,  as 
shown  in  Figure  8  for  a  sample  target  dataset. 
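
The count feature itself is simple enough to sketch in a few lines. The function name and the use of `np.percentile` to obtain the 98th-percentile clutter threshold are illustrative assumptions.

```python
import numpy as np

def count_feature(pixels_db, clutter_db):
    """Fraction of pixels in the target-sized box that exceed the clutter threshold."""
    p = np.asarray(pixels_db, dtype=float).ravel()
    threshold = np.percentile(clutter_db, 98)      # same threshold T as specific entropy
    return float(np.count_nonzero(p > threshold) / p.size)

clutter = np.zeros(100)                            # toy clutter: threshold T = 0 dB
count = count_feature([5.0, 2.0, 0.0, -1.0], clutter)
```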

In this plot, specific entropy is plotted on the
ordinate, and the logarithm of the count feature is
plotted on the abscissa. If the two features are highly
correlated, the plot shows the points falling along a
straight line, which is indeed the case in this example.
In fact, all the target and clutter false-alarm
datasets that we examined showed similar scatterplots.
For practical purposes, this scatterplot indicates that
the count feature and the specific-entropy feature are
equivalent. The extra steps given above in the calculation
of the specific-entropy feature do little to
increase the separation between targets and natural-clutter
false alarms.

FIGURE 8. Scatterplot of the Rockwell specific-entropy feature versus count feature. Points falling on a straight
line indicate a high correlation between the two features. (Ordinate: specific entropy, 0.0 to 1.0; abscissa: count
feature, logarithmic scale from 0.0001 to 1.0.)

Loral  Discrimination  Feature 

Loral Defense Systems was the third participant in
the  STAR  contract.  Many  of  their  discrimination 
features  substantially  overlapped  the  features  of  the 
other  two  contractors  and  Lincoln  Laboratory.  For 
this reason, we examined only one Loral discrimination
feature: the contiguousness feature.

To calculate this feature, we segment each image
into three separate images based on the amplitude of
individual pixels, as shown in Figure 9. All the pixels
in an individual image are histogrammed and then
divided into three categories. The lowest 25% of pixels
are called shadow (Level 1), the middle 60% are
called background (Level 2), and the top 15% are
called the target region (Level 3). For the purposes of
this study, we modified the procedure by applying the
thresholding process only to each region of interest,
which consists of a target-sized area plus a border
containing the surrounding clutter. Each region of
interest was then segmented into the three categories
by using only the clutter area surrounding each target
candidate to determine the threshold levels.

FIGURE 9. Concept of contiguousness feature. (a) The
target within the region of interest has an irregular shape.
The radar illuminates this shape from the top, which causes
a shadow to extend downward in the image. (b) The two
thresholds in the histogram of pixel power (in dB) divide
the region of interest into three categories: shadow, background,
and target region. The Loral contiguousness feature
is computed by first forming six separate regions of
interest based on these categories.

The thresholding procedure effectively creates three
regions of interest: one that contains the brightest
pixels (called the Level 3 image), one that contains
the dimmest pixels (called the Level 1 image), and
one that contains the midlevel pixels (called the Level
2 image). The same thresholding procedure is performed
on the CFAR image, which is derived in the
same way as the CFAR image described in the section
on ERIM discrimination features and determined by
the expression in Equation 2. The contiguousness
feature is determined by computing numbers from
each of these six regions of interest.

The computation of the contiguousness feature is
straightforward. Each contiguousness number is derived
from only one image (i.e., the Level 1 CFAR
image, the Level 2 CFAR image, the Level 3 CFAR
image, the Level 1 image, the Level 2 image, or the
Level 3 image). Each pixel included in each particular
image is counted, and its immediate neighbors that
appear in the same image are counted as well. The
count is then normalized by the total number of possibilities
that could have occurred (which is nine times
the number of pixels in the image). The final number,
which has a value between zero and one, is the contiguousness
number for that image. This operation is
done for every image, so the contiguousness feature
gives six separate numbers for each region of interest.
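
The counting rule described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: each level image is taken to be a boolean mask over the region of interest, "immediate neighbors" is read as the eight surrounding pixels (so a pixel and its neighbors contribute at most nine counts), and the function name `contiguousness` is hypothetical rather than taken from the Loral implementation.

```python
import numpy as np

def contiguousness(mask):
    """Contiguousness number of one level image (assumed boolean mask).

    Every pixel in the mask is counted once, plus each of its eight
    neighbors that is also in the mask. The total is divided by nine
    times the number of mask pixels, giving a value between 0 and 1.
    """
    mask = np.asarray(mask, dtype=bool)
    num_pixels = int(mask.sum())
    if num_pixels == 0:
        return 0.0  # assumption: an empty level image scores zero
    rows, cols = mask.shape
    # Zero-pad so that pixels on the border have "missing" neighbors.
    padded = np.pad(mask, 1, mode="constant", constant_values=False).astype(int)
    window_sum = np.zeros(mask.shape, dtype=int)
    for dr in range(3):
        for dc in range(3):
            window_sum += padded[dr:dr + rows, dc:dc + cols]
    # Sum the 3-by-3 window counts only at pixels that belong to the mask.
    count = int(window_sum[mask].sum())
    return count / (9.0 * num_pixels)
```

Applying the function to the Level 1, Level 2, and Level 3 masks of both the image and the CFAR image would give the six numbers mentioned in the text. An isolated pixel scores 1/9, and a large solid block approaches 1.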

Discrimination  Algorithm 

As  mentioned  at  the  beginning  of  the  section  on 
algorithm  description,  the  Lincoln  Laboratory  dis¬ 
crimination  algorithm  is  based  on  a  one-class  qua¬ 
dratic  discrimination  algorithm,  the  inputs  of  which 
are  the  feature  vectors  for  each  candidate  region  of 
interest.  The  algorithm  is  trained  beforehand  with 
representative target data only (no clutter data are
used for the training, hence the one-class algorithm).
Often  these  target  data  consist  of  images  of  targets 
with  no  countermeasures  applied.  The  tests  performed 
for  this  article  use  this  training  method. 

The reasoning behind this type of training is that
predicting the use or modification of enemy targets is
impossible, so training on exactly the same types of
targets that will be encountered in a real situation is
also impossible. Therefore, data gathered by using
non-countermeasured targets seem a reasonable choice
for a target-training dataset. Any realistic test of the
discrimination algorithm, however, must include targets
that have some countermeasures applied.

Theoretical Analysis of the One-Class Quadratic
Discrimination  Algorithm 

The one-class quadratic discrimination algorithm used
in the Lincoln Laboratory multistage target-detection
algorithm can be described mathematically as

$$z_i = \frac{1}{n}\,(X_i - M_{tr})^{T} S_{tr}^{-1} (X_i - M_{tr}), \qquad i = 1, 2, \ldots, k_t + k_c, \tag{3}$$


where $n$ is the number of features used in the discriminator,
$M_{tr}$ and $S_{tr}$ are the estimates of the mean
vector and variance-covariance matrix of the training
target set, $X_i$ is a random vector representing the
observed candidate features, and $z_i$ is a random variable
representing the distance from the test point to
the target-training class. The two variables $k_t$ and $k_c$
are the number of targets and the number of clutter
false alarms, respectively, that the discriminator receives
from the prescreener stage of the multistage
algorithm.

To analyze the discriminator given by Equation 3,
we need to find the quantities

$$\mathrm{Prob}\{z_i < K \mid i \text{ is target}\} = P_d \tag{4}$$

and

$$\mathrm{Prob}\{z_i < K \mid i \text{ is clutter}\} = P_{fa}, \tag{5}$$

where $K$ is the hard threshold. This analysis involves
finding the probability density function (pdf) of
$z_i$ for the target case and for the clutter false-alarm
case, and then integrating the pdf according to Equations
4 and 5.

The distribution of $z_i$ for the target-training dataset
is easy to calculate if the assumption is made that the
estimates of $M_{tr}$ and $S_{tr}$ take on their true values. For
tractability, we also assume that the features are multivariate
Gaussian distributed. Then the $z_i$ values are
chi-squared distributed [8], and the pdf can be written

$$f_z(z) = \frac{(n/2)^{n/2}}{\Gamma(n/2)}\, z^{\,n/2 - 1}\, e^{-nz/2}, \qquad z \ge 0,$$

where $E(z) = 1$ and $\mathrm{Var}(z) = 2/n$.

The distribution of $z_i$ under the target (non-training)
and clutter false-alarm classes with these assumptions
is more difficult to calculate. In each of the two
cases, a different matrix $A$ must be found such that

$$A_t^{T} S_{tr} A_t = I, \qquad A_t^{T} S_t A_t = L_t,$$

$$A_c^{T} S_{tr} A_c = I, \qquad A_c^{T} S_c A_c = L_c,$$

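where $L_t$ and $L_c$ are diagonal matrices. This operation,
which is a simultaneous diagonalization, reduces the
problem of evaluating Equations 4 and 5 to one of
finding the distribution of

$$z_t = \frac{1}{n} \sum_{i=1}^{n} \lambda_{t,i}\,(x_i - \omega_{t,i})^2 \tag{6}$$

and

$$z_c = \frac{1}{n} \sum_{i=1}^{n} \lambda_{c,i}\,(x_i - \omega_{c,i})^2, \tag{7}$$

where

$$\lambda_{t,i} = \mathrm{diag}(L_t)_i, \qquad \omega_{t,i} = \left[A_t^{T}(M_t - M_{tr})\right]_i,$$

$$\lambda_{c,i} = \mathrm{diag}(L_c)_i, \qquad \omega_{c,i} = \left[A_c^{T}(M_c - M_{tr})\right]_i,$$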




and where the operator diag(·) indicates the extraction
of the diagonal vector from the matrix argument.
The quantity n is the number of features used in the
discriminator.
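
As an illustration of the two building blocks used in this analysis, the following Python sketch computes the one-class quadratic distance of Equation 3 and a simultaneous diagonalization of two covariance matrices. It is not the authors' implementation: the function names are hypothetical, and the Cholesky-plus-eigendecomposition route is one standard way, assumed here, of finding a matrix A that maps the training covariance to the identity and a second covariance to a diagonal matrix.

```python
import numpy as np

def one_class_distance(X, M_tr, S_tr):
    """Distance of Equation 3 for each row of X (one feature vector per row)."""
    D = np.atleast_2d(np.asarray(X, dtype=float)) - np.asarray(M_tr, dtype=float)
    n = D.shape[1]  # number of features
    # Solve S_tr y = d for every row d instead of forming the inverse.
    solved = np.linalg.solve(np.asarray(S_tr, dtype=float), D.T).T
    return np.einsum("ij,ij->i", D, solved) / n

def simultaneous_diagonalization(S_tr, S_other):
    """Return (A, lam) with A.T @ S_tr @ A = I and A.T @ S_other @ A = diag(lam)."""
    C = np.linalg.cholesky(np.asarray(S_tr, dtype=float))  # S_tr = C C^T
    W = np.linalg.inv(C).T                                 # W^T S_tr W = I
    # Rotate the whitened second covariance onto its principal axes.
    lam, U = np.linalg.eigh(W.T @ np.asarray(S_other, dtype=float) @ W)
    return W @ U, lam
```

A candidate would then be declared a target when `one_class_distance` falls below the hard threshold K, and `simultaneous_diagonalization(S_tr, S_c)` would supply the diagonal entries used in the clutter case.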

Calculating the distribution of Equation 6 or Equation
7 without the multivariate Gaussian assumption
would be difficult because the summation would then
be over uncorrelated (but not necessarily independent)
random variables. Once again, we make an
assumption that the estimates of $M_{tr}$ and $S_{tr}$ take on
their true values.

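With the $x_i$ taken as independent, zero-mean, unit-variance
Gaussian variables, the characteristic function of the
distribution of Equations 6 and 7 is given by

$$\Phi_z(u) = \exp\left\{ \sum_{i=1}^{n} \left[ \frac{j u \lambda_i \omega_i^2}{n - 2 j u \lambda_i} - \frac{1}{2} \ln\left(1 - \frac{2 j u \lambda_i}{n}\right) \right] \right\}.$$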


In this equation, $j = \sqrt{-1}$. We can omit the target and
clutter-false-alarm subscripts because the mathematics
for the two cases is similar. This characteristic
function can be inverted and integrated according to
Equations 4 and 5 by using Fourier transform theory,
so that


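$$P_d(K) = \int_0^K f_{z_t}(z)\,dz = \frac{1}{\pi} \int_0^\infty \left\{ \mathrm{Re}[\Phi_{z_t}(q)]\,\frac{\sin(Kq)}{q} + \mathrm{Im}[\Phi_{z_t}(q)]\,\frac{1 - \cos(Kq)}{q} \right\} dq$$

and

$$P_{fa}(K) = \int_0^K f_{z_c}(z)\,dz = \frac{1}{\pi} \int_0^\infty \left\{ \mathrm{Re}[\Phi_{z_c}(q)]\,\frac{\sin(Kq)}{q} + \mathrm{Im}[\Phi_{z_c}(q)]\,\frac{1 - \cos(Kq)}{q} \right\} dq.$$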
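
A quick numerical cross-check of these probabilities can be made by simulation rather than by Fourier inversion. The sketch below is an assumption-laden illustration, not part of the original analysis: it draws the variables in the weighted sums of Equations 6 and 7 as independent standard Gaussian samples and estimates the probability that the sum falls below the threshold K. The function names, the number of draws, and the seed are all hypothetical choices.

```python
import numpy as np

def simulate_z(lam, omega, num_draws=200_000, seed=0):
    """Draw samples of z = (1/n) * sum_i lam_i * (x_i - omega_i)**2.

    Assumes the x_i are independent, zero-mean, unit-variance Gaussian.
    """
    lam = np.asarray(lam, dtype=float)
    omega = np.asarray(omega, dtype=float)
    rng = np.random.default_rng(seed)
    x = rng.standard_normal((num_draws, lam.size))
    return (lam * (x - omega) ** 2).sum(axis=1) / lam.size

def prob_below_threshold(lam, omega, K, **kwargs):
    """Monte Carlo estimate of Prob{z < K}, the quantity in Equations 4 and 5."""
    return float(np.mean(simulate_z(lam, omega, **kwargs) < K))
```

With all weights equal to one and all offsets equal to zero, the simulated distance reduces to the target-training case, whose mean is 1 and whose variance is 2/n. Nonzero offsets push the distances outward and lower the probability of falling below a fixed K.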


Confirming the Gaussian Assumption

We make a key assumption in the theoretical performance
prediction for the discrimination algorithm
given above: we assume that the datasets (target training,
target testing, and clutter false alarm) are multivariate
Gaussian distributed. There are two tests that
are feasible to confirm this assumption; the first test is
for the univariate case and the second test is for the
bivariate case. Tests for higher dimensions exist, but
they can become complicated and difficult to interpret
[9] (or they rely on making some other crucial
assumption, which can be difficult to check). We
chose to perform the univariate and bivariate tests, for
which we give the results here. We also created a
scatterplot for a trivariate test.

All the tests described here were done as a check on
algorithm performance and not as an end in themselves.
We did not try to calculate exact quantitative
measures for goodness of fit. An exact study would
have added considerable complexity to our task, while
providing little insight. Instead, our tests were done
by using graphical techniques and the fits were performed
by eye; only approximate Gaussianity can be
ascertained by such techniques. The proof that the
theory is an accurate predictor of performance is not
contained in these tests, but rather in the comparison
of real data results with theoretical results. This comparison
is given in the section entitled “Real Data
versus Theoretical Performance.”

The univariate test is straightforward. We plot each
feature on Gaussian-scaled paper, and test it by examining
whether the cumulative distribution function is a straight
line. In general, we found that most of the features for
most of the datasets were adequately univariate
Gaussian. In the few exceptional cases, the distributions
were not far off, and the discrepancies were not
significant in the final results. Figure 10 is an example
of a univariate test with the fractal-dimension feature.
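
The Gaussian-paper check can be imitated numerically. The following Python sketch is offered only as an illustration: it pairs the sorted feature values with standard normal quantiles (the points that would be plotted on Gaussian-scaled paper) and reports their correlation coefficient as a rough stand-in for judging straightness by eye. The plotting positions (i - 0.5)/N and the function names are assumptions, and the authors did not compute such a number.

```python
import numpy as np
from statistics import NormalDist

def normal_probability_points(samples):
    """Return (normal quantiles, sorted samples) for a normal probability plot."""
    x = np.sort(np.asarray(samples, dtype=float))
    n = x.size
    # Assumed plotting positions, strictly inside (0, 1).
    probs = (np.arange(1, n + 1) - 0.5) / n
    quantiles = np.array([NormalDist().inv_cdf(p) for p in probs])
    return quantiles, x

def straightness(samples):
    """Correlation of the probability-plot points; values near 1 suggest a straight line."""
    quantiles, x = normal_probability_points(samples)
    return float(np.corrcoef(quantiles, x)[0, 1])
```

Plotting the two arrays returned by `normal_probability_points` against each other reproduces the kind of graph shown in Figure 10, where Gaussian data fall along a straight line.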

FIGURE 10. Univariate Gaussian test of the fractal-dimension feature. Any straight
line on this graph represents a Gaussian curve.

The bivariate case is tested by using scatterplots,
which show data points of one feature versus another
feature. For Gaussianity, these points should fall in an
ellipsoidal bunch around the centroid of the data
points. There should be more data points near the
center of the ellipse, and fewer data points farther
from the center of the ellipse. We could carefully and
quantitatively verify the percentage of points within a
certain normalized radius of the center of mass, which
would give a measure of how nearly Gaussian the data
distribution was. We did not, however, perform this
quantitative verification; we merely observed how closely
the data points were bunched around the center of
mass.
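
For readers who want the quantitative version of this check, a minimal Python sketch follows. It interprets the "normalized radius" as the Mahalanobis distance computed from the sample mean and covariance, which is an assumption on our part, and compares the observed fraction of points inside that radius with the value expected for a bivariate Gaussian, 1 - exp(-r^2/2). The function names are hypothetical.

```python
import numpy as np

def fraction_within_radius(points, radius):
    """Fraction of points whose Mahalanobis distance from the centroid is <= radius."""
    pts = np.asarray(points, dtype=float)
    centered = pts - pts.mean(axis=0)
    cov = np.cov(centered, rowvar=False)
    # Squared Mahalanobis distance of every point from the center of mass.
    d2 = np.einsum("ij,ij->i", centered, np.linalg.solve(cov, centered.T).T)
    return float(np.mean(d2 <= radius ** 2))

def expected_fraction_bivariate(radius):
    """Fraction expected inside the same radius for an exactly bivariate Gaussian."""
    return 1.0 - float(np.exp(-radius ** 2 / 2.0))
```

Large gaps between the two numbers at several radii would indicate a departure from Gaussianity that the eye might miss in a scatterplot.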

These graphs are also useful for checking the correlation
between two features; the more nearly the data
points fall along a line, the higher the correlation. If the data
points fall in an ellipse that is horizontally or vertically
oriented, then the data points are uncorrelated.
This test is not just an interesting footnote; the section
entitled “Feature Choice Guidelines” describes
the importance of choosing features that are orthogonal
(i.e., uncorrelated) for good discrimination performance.
The scatterplots can also give additional
insight into the ability of two features (taken simultaneously)
to separate targets from clutter. Ideally, we
would like the target-training dataset and the target-testing
dataset to be coincident, and the clutter false-alarm
dataset to be separated from the other two by a
wide margin (measured both along the abscissa and
the ordinate).

Figure 11 shows an example of a scatterplot for the
fractal-dimension feature versus the weighted-rank
fill-ratio feature. The target-training, target-testing,
natural-clutter false-alarm, and natural- and cultural-clutter
false-alarm datasets are shown. The target-training
and target-testing datasets seem to be reasonably
elliptically distributed around their centers of
mass, and they seem to be close to each other; both of
these properties are desirable. The clutter false-alarm
datasets seem to be somewhat less elliptical, but are
reasonably well separated from both target datasets.

The  three-dimensional  scatterplot  shown  in  Figure 
12  illustrates  all  three  Lincoln  Laboratory  discrimina¬ 
tion  features  for  the  target-training,  the  target-testing, 
and  the  natural-clutter  false-alarm  datasets  described 
in the section entitled “Data Used.” There are two
things  to  be  noticed  about  this  figure.  First,  the  figure 
clearly  shows  the  separation  between  targets  and  natu¬ 
ral-clutter  false  alarms,  and  it  shows  that  the  clutter 
false  alarms  intermingled  with  the  target  datasets  tend 
to  be  those  created  by  man-made  objects  (i.e.,  cul¬ 
tural  clutter).  Second,  the  figure  helps  confirm  the 
approximate  Gaussianity  of  the  target  datasets  and 
the  natural-clutter  false-alarm  dataset. 

Notice the distribution of the red points (the target-training
dataset) in the figure. If these red points
are Gaussian distributed, they should form an ellipsoidal
pattern around the center of the red point
cloud with greater density of points toward the center.
Likewise, the dark blue points (the target-testing
dataset), the green points (the natural-clutter false-alarm
dataset), and the light blue points (the cultural-clutter
false-alarm dataset) in the figure should be
distributed in a similar manner for Gaussianity to
hold.

Figure 12 shows that the red points and the dark
blue points are distributed in an approximately ellipsoidal
pattern around their respective centers. The
green points, however, are less ellipsoidal in their
distribution, and the light blue points are clearly non-ellipsoidal.
As we demonstrate in a later section, the
minor deviation of the green points (i.e., the natural-clutter
false alarms) from Gaussianity does not greatly
affect the agreement between the theory and the real
data. The lack of Gaussianity in the light blue points
(i.e., the cultural-clutter false alarms) is not critical
because the discriminator is designed to eliminate the
natural-clutter false alarms and pass the cultural-clutter
false alarms to the classification algorithm.

The goal of the discrimination algorithm, as stated
earlier in this article, is to reject false alarms caused by
natural clutter. For most of this article, we do not distinguish
between clutter false alarms caused by natural
clutter and clutter false alarms caused by cultural
clutter, because it is impossible to know, in any kind
of realistic scenario, which type of clutter false alarm a
given region of interest is (or even if the region of
interest is a clutter false alarm or a legitimate target).

In Figure 12 we separate the two types of false
alarms for analysis purposes. For the discrimination
algorithm to perform well, the targets must be separated
from the natural-clutter false alarms. Figure 12
shows that the targets are indeed separated from the
natural-clutter false alarms but not from the cultural-clutter
false alarms. In other words, the discrimination
algorithm does well in the role for which it was
designed.

FIGURE 11. Scatterplot of fractal-dimension feature versus weighted-rank fill-ratio feature. For good discrimination
performance with these two features, the target datasets should be separate from the false-alarm datasets.
(Legend: Stockbridge natural-clutter false alarms; Stockbridge natural- and cultural-clutter false alarms; target testing; target training.)

FIGURE 12. Three-dimensional scatterplot of the three Lincoln Laboratory discrimination features: fractal
dimension, weighted-rank fill ratio, and standard deviation.
(Legend: target training; target testing; natural clutter; cultural clutter.)

Goals

An important goal of this study is to choose the best
set of features from the discrimination feature list
given above in the section on discrimination features.
As we already stated, these features are standard deviation,
fractal dimension, weighted-rank fill ratio, mass,
diameter, square-normalized rotational inertia, maximum
CFAR statistic, mean CFAR statistic, percent
bright CFAR statistic, percent pure, percent pure even,
percent bright even, polarimetric phase ratio, specific
entropy, and contiguousness. What determines a best
set of these features? Initially we do not know whether
the best set contains some combination of three features
or four features or more, or all fifteen features,
or even if the best set is composed of certain of these
features and not others.


An additional goal is that the features chosen must
be a robust set of features. Ideally, the features should
work equally well, regardless of the target deployment,
the countermeasures used, or the type of clutter
being imaged. This invariance to data is impossible
to achieve; a more realistic goal is for the same
features to be part of the best feature set, regardless of
data. As a test of this goal, we examine a number of
different datasets.

Another goal is to understand the operation of the
discrimination algorithm. We would like the theoretical
expressions for the performance of the one-class
quadratic discrimination algorithm to predict the behavior
of the discrimination algorithm accurately. If
the theory is accurate, then we can predict the performance
of the algorithm for different combinations of
features. A strategy for feature selection would then
be to derive the parameters needed for the theory
from the real data, compute the theoretical results for
all combinations of features, and choose the best set
of features based on these predictions. We shall see
that this strategy is a reasonable one.
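
The exhaustive strategy just described can be outlined in Python as follows. This is a sketch only: the scoring function is left to the caller and is a hypothetical stand-in for whatever figure of merit is read from the theoretical performance curves, and the function name is our own. With fifteen candidate features there are 32,767 nonempty subsets, so a full enumeration is feasible when the score is cheap to compute.

```python
from itertools import combinations

def rank_feature_subsets(features, score, min_size=1, max_size=None):
    """Enumerate feature subsets and sort them from best to worst score.

    `score` is a caller-supplied (hypothetical) function that maps a tuple
    of feature names to a number, with larger values meaning better
    predicted discrimination performance.
    """
    features = list(features)
    if max_size is None:
        max_size = len(features)
    subsets = [
        combo
        for size in range(min_size, max_size + 1)
        for combo in combinations(features, size)
    ]
    return sorted(subsets, key=score, reverse=True)
```

The first few entries of the returned list would form the short list of combinations that is then examined with real data.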

We would also like to examine discrimination results
for different resolutions and different polarizations.
The Lincoln Laboratory MMW SAR has a
resolution  of  1  ft  by  1  ft,  but  we  can  easily  construct 
lower-resolution,  single-look  radar  imagery  from  the 
data  at  hand.  Additionally,  the  Lincoln  Laboratory 
MMW  SAR  is  fully  polarimetric  (fully  polarimetric 
SAR  data  allow  synthesis  of  any  polarization  or  com¬ 
bination  of  polarizations).  Many  radar  sensors  are  not 
fully  polarimetric,  and  use  only  a  single  polarization. 
The  most  common  polarization  used  is  HH;  there¬ 
fore,  we  use  HH  data  as  well  as  fully  polarimetric 
data  in  this  study.  We  expect  the  best  feature  set  to 
change,  depending  on  our  choice  of  resolution  and 
polarization. 

Choosing  a  Feature  Set 

The method we use to choose the best feature set is
straightforward. First, the data are prescreened by
using a simple two-parameter CFAR algorithm [2,
10]. This stage is designed to eliminate (with a minimum
of computation) only the most obviously non-targetlike
clutter. The prescreening algorithm operates
on imagery that has already been reduced to a
resolution of 1 m by 1 m. The resolution was reduced
by taking a noncoherent average of each
4-pixel-by-4-pixel non-overlapping box (a pixel has a
nominal resolution of approximately 0.23 m). This
method of resolution reduction has two advantages:
(1) it reduces the amount of data we need to process,
and (2) it reduces the speckle that is present in the
high-resolution SAR imagery (the article by Leslie M.
Novak et al. in this issue gives an explanation of
speckle in SAR imagery).
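
The resolution reduction described above amounts to averaging pixel power over non-overlapping boxes. A minimal Python sketch, assuming the input is a two-dimensional array of pixel power (not complex amplitude, and not dB) and that any incomplete boxes at the edges are simply dropped, is given below; the function name is hypothetical.

```python
import numpy as np

def noncoherent_average(power, box=4):
    """Average pixel power over non-overlapping box-by-box blocks."""
    power = np.asarray(power, dtype=float)
    rows = (power.shape[0] // box) * box
    cols = (power.shape[1] // box) * box
    trimmed = power[:rows, :cols]  # assumption: discard incomplete edge blocks
    # Split each axis into (block index, position within block) and average
    # over the two within-block axes.
    blocks = trimmed.reshape(rows // box, box, cols // box, box)
    return blocks.mean(axis=(1, 3))
```

Each output pixel replaces sixteen input pixels, which is the source of both the data reduction and the speckle reduction noted in the text.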

In  the  prescreener  for  this  study  we  use  a  threshold 
value  that  allows  the  detection  of  80%  of  the  targets. 
This  percentage  was  chosen  for  consistency  among 
datasets;  it  was  also  chosen  by  considering  the  num¬ 
ber  of  clutter  chips  that  are  passed  to  the  discrimina¬ 
tion  stage.  A  higher  probability  of  detection  in  the 
prescreener  stage  necessarily  increases  the  number  of 
clutter  false  alarms  passed  to  the  discriminator.  Com¬ 
putation  time  and  storage  limitations  preclude  using 
a  higher  percentage  value  for  the  prescreener  prob¬ 


ability  of  detection.  The  data  used  in  the  prescreener 
algorithm  were  also  processed  by  using  the  polarimet¬ 
ric  whitening  filter  (PWF)  [1],  which  combines  the 
HH,  HV,  and  VV  polarization  channels  together  in  a 
manner  that  optimally  decreases  speckle.  The  HH 
polarization  results  use  only  the  HH  polarization  SAR 
data,  and  hence  do  not  use  the  PWF  imagery  for  the 
prescreener  algorithm. 
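
One simple way to set a prescreener threshold that passes a chosen fraction of the targets is to take a quantile of the prescreener statistic measured on known target locations. The Python sketch below illustrates that idea only; it assumes that larger statistic values are more target-like and that a candidate passes when its statistic is at or above the threshold, and the function names are hypothetical. It does not reproduce the two-parameter CFAR computation itself.

```python
import numpy as np

def threshold_for_detection_rate(target_scores, detection_rate=0.8):
    """Threshold that roughly `detection_rate` of the target scores meet or exceed."""
    scores = np.asarray(target_scores, dtype=float)
    return float(np.quantile(scores, 1.0 - detection_rate))

def count_passed(scores, threshold):
    """Number of candidates (targets or clutter) passed on to the discriminator."""
    return int(np.sum(np.asarray(scores, dtype=float) >= threshold))
```

Lowering the threshold to detect more targets also raises `count_passed` for the clutter scores, which is the trade-off between detection probability and the number of clutter chips described above.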

The  candidates  identified  by  the  prescreener  (ei¬ 
ther  on  targets  or  on  clutter  false  alarms)  are  then 
grouped  spatially.  The  grouping  algorithm  is  a  simple 
one:  all  hits  within  a  target-sized  area  are  grouped 
into  a  single  detection.  This  grouping  operation  ex¬ 
ploits  some  of  the  spatial  information  inherent  in  the 
proximity  of  prescreener  hits. 
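
The grouping step can be sketched as a simple greedy pass over the prescreener hits. The version below is an assumption about one reasonable implementation, not the authors' code: hits are (row, column) pixel positions, a "target-sized area" is taken to be a square of half-width `target_size` around the first ungrouped hit, and each group is reported by its centroid.

```python
def group_hits(hits, target_size):
    """Greedily merge prescreener hits that fall within a target-sized area.

    Returns one (row, col) centroid per grouped detection.
    """
    remaining = [tuple(h) for h in hits]
    detections = []
    while remaining:
        seed_row, seed_col = remaining[0]
        # Collect every hit inside the square centered on the seed hit.
        members = [
            (r, c) for (r, c) in remaining
            if abs(r - seed_row) <= target_size and abs(c - seed_col) <= target_size
        ]
        remaining = [h for h in remaining if h not in members]
        count = len(members)
        detections.append((
            sum(r for r, _ in members) / count,
            sum(c for _, c in members) / count,
        ))
    return detections
```

Each centroid then defines a single region of interest for the discrimination stage, so several hits on one vehicle are not counted as several detections.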

The discrimination algorithm is run on all the
regions of interest selected by the prescreener and the
grouping algorithm. First, all the features described in
the section on discrimination features are computed
for all regions of interest. The features are computed
for the following four combinations of data: (1) 1-ft
resolution and PWF polarization, (2) 1-m resolution
and PWF polarization, (3) 1-ft resolution and HH
polarization, and (4) 1-m resolution and HH polarization.
The features were originally tuned (in terms
of the thresholds used in the feature calculations themselves)
for the 1-ft resolution, PWF case. The features
are used without modification for this case as well as
for the 1-ft resolution, HH polarization case.

Naturally, the polarimetric features cannot be calculated
for the HH polarization case because the
polarimetric features use polarizations other than HH.
We  therefore  use  a  reduced  set  of  features.  For  the 
1-m  resolution  cases,  we  retune  the  features  by  com¬ 
puting  them  for  a  range  of  thresholds,  and  we  choose 
the  threshold  that  provides  the  best  separation  be¬ 
tween  targets  and  clutter  false  alarms  for  all  datasets. 
This  retuning  is  done  separately  for  the  PWF  case  and 
the  HH  polarization  case.  Therefore,  these  features 
are  intended  to  give  best-case  results.  Any  use  of  these 
tuned  features  in  other  datasets  can  only  approach  the 
results  shown  in  the  article  in  general.  Certainly,  the 
1-m  resolution  tests  provide  a  better  indication  of  the 
performance  of  the  discrimination  algorithm  than  is 
likely  to  be  obtained  in  a  real  situation. 

INTERPRETING PLOTS OF Pd VERSUS FA/KM2

The method of evaluation for the discrimination algorithm
described in this paper involves plotting a curve that shows
the probability of detection (Pd) versus the number of false
alarms per square kilometer (FA/km2). The measure of FA/km2
scales directly to the probability of false alarm, which was
theoretically derived for the discrimination algorithm in the
section entitled “Theoretical Analysis of the One-Class Quadratic
Discrimination Algorithm.” Such curves are often referred to as
receiver operating-characteristic (ROC) curves.

Figure A gives an example of a simple ROC curve (in red). Better
performance is indicated in these types of plots by a curve
moving upward and leftward. A plot such as this one might be
used to evaluate the prescreener stage or the discrimination
stage separately. A more complicated plot is necessary to
evaluate the combination of the prescreener and discrimination
stages.

Figure A also shows an example of a plot (blue and red) that
might be used to evaluate both the prescreener and discrimination
stages combined. Notice that the original ROC has grown a number
of extra lines, or tails. These extra lines (in blue) represent
the improved performance provided by the discrimination stage of
the multistage target-detection algorithm. Each extra line meets
the curve of the original prescreener stage at a certain point.
Each discrimination line emanating from these points describes
the operating characteristic of the discrimination algorithm for
the particular set of inputs being provided by the prescreener at
that operating point.

As the threshold of the prescreener is varied, the set of inputs
provided to the discrimination algorithm varies as well. The
evaluation criterion for performance in an ROC curve works here
as well; the line moving upward and leftward indicates better
performance.

FIGURE A. Example of a Pd versus FA/km2 curve, which is also known as a
receiver operating-characteristic curve, for a multistage target-detection
algorithm. The additional lines represent the performance of the discrimination
stage of the algorithm. Three of these performance lines are shown; in
fact, an infinite number of them are possible, because their intersections
with the prescreener curve are dictated by the level at which the prescreener
stage is operated.
(Axes: Pd from 0.0 to 1.0 versus FA/km2 on a logarithmic scale from 0.001 to 1000.)

The parameters necessary for a theoretical evaluation
of the discrimination performance are computed
in each case from a target region-of-interest dataset
and a clutter false-alarm region-of-interest dataset.
Additionally, the assumptions necessary for the theory
to hold are checked in most cases. These checks are
more fully detailed in the section entitled “Confirming
the Gaussian Assumption.” All combinations of
the discrimination features are tested by using the
theory. Theoretical Pd versus Pfa plots are produced
for each combination, and the best combinations are
chosen for further analysis. The best combinations of
features are given in the section entitled “Best Features
for Discrimination.” The results we get by using
the real data are then generated for the short list of
good combinations. The real-data results are then
compared with the theoretical results; these comparisons
are given in the section entitled “Real Data versus
Theoretical Performance.”

Performance evaluation is done by plotting the
probability of detection Pd versus the number of false
alarms per square kilometer (FA/km2). The measure
of false alarms per square kilometer is merely a rescaling
of the probability of false alarm (Pfa) into an operationally
meaningful measure. This rescaling is performed
to remove the effect of sensor resolution (because
a higher resolution image inherently gives more
opportunities for false alarms to occur, the same Pfa
value at different resolutions means different numbers
of false alarms per square kilometer). The interpretation
of plots of Pd versus FA/km2 is reviewed in
the sidebar entitled “Interpreting Plots of Pd versus
FA/km2.”
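
The effect of resolution on this rescaling can be illustrated with a small Python sketch. It assumes, purely for illustration, that the number of opportunities for a false alarm equals the number of resolution cells per square kilometer; the actual count of opportunities used in the study is not stated here, and the function names are hypothetical.

```python
def cells_per_km2(resolution_m):
    """Number of square resolution cells in one square kilometer (assumed opportunity count)."""
    return 1.0e6 / (resolution_m ** 2)

def pfa_to_fa_per_km2(pfa, opportunities_per_km2):
    """Rescale a probability of false alarm into false alarms per square kilometer."""
    return pfa * opportunities_per_km2

def fa_per_km2_from_counts(num_false_alarms, area_km2):
    """Empirical false-alarm density from a count and the imaged area."""
    return num_false_alarms / area_km2
```

Under this assumption, the same probability of false alarm corresponds to roughly ten times more false alarms per square kilometer at 1-ft (0.3048-m) resolution than at 1-m resolution, which is why the density measure is the more meaningful one to compare across resolutions.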

Data  Used 

All  the  target  data  used  in  this  study  were  gathered 
with  the  Lincoln  Laboratory  MMW  SAR  in 
Stockbridge,  New  York.  The  targets  consisted  of  two 
datasets  of  the  same  targets  in  different  deployment 
conditions.  The  first  dataset,  which  we  use  for  dis¬ 
crimination  algorithm  training,  is  called  the  target¬ 
training  dataset.  The  second  dataset,  which  we  use  for 
discrimination algorithm testing, is called the target-testing
dataset. There are three distinct clutter datasets:
two gathered at Stockbridge, New York, and a smaller
clutter dataset gathered in Concord, Massachusetts.

The  first  clutter  dataset,  which  consists  of  mostly 
natural  clutter,  is  called  the  Stockbridge  natural-clut¬ 
ter  dataset.  Figure  13  is  an  example  of  this  dataset;  it 
shows  a  river  with  treelined  banks  (the  river  is  the 
dark area curving through the middle of the image).
The remainder of the image is an open field. Freshly
plowed furrows in the open field can also be seen. The
radar illuminates the area from the top of the image;
therefore, the shadows cast by the trees point downward
in the image. The Stockbridge natural-clutter
dataset also includes some man-made objects (which
are impossible to avoid entirely in the Stockbridge
area), including the farmhouse shown in an earlier
article by Novak [1]. The clutter in this dataset is
considered to be relatively benign.

FIGURE 13. SAR image of natural clutter in Stockbridge, New York. The sensor is flying parallel to the top of
the image, and the shadows extend downward in the image. Areas of high radar return are colored in bright
yellow; areas of low radar return are in dark colors. The dark band in the middle of the image is a river with
trees lining each bank. The smooth green areas are open fields.

The  second  clutter  dataset  is  called  the  Stockbridge 
natural-  and  cultural-clutter  dataset.  This  dataset  was 
gathered  from  a  different  area  of  the  same  Stockbridge 
collection  site;  it  includes  a  farm-supply  store  that  is 
shown  in  Figure  14  both  as  a  SAR  image  and  in  an 
aerial  photograph.  The  clutter  in  this  dataset  is  con¬ 
sidered  to  be  moderately  difficult. 

The  third  clutter  dataset  is  a  small  dataset  gathered 
in  Concord,  Massachusetts,  which  is  a  few  miles  from 
Lincoln  Laboratory.  This  dataset,  which  we  refer  to  as 
the  Concord  dataset,  consists  entirely  of  man-made 
clutter, and is considered to be a very difficult dataset.
Figure 15 shows an example of imagery from this
dataset.

Best  Features  for  Discrimination 

The  method  used  to  determine  the  best  features  for 
discrimination  is  fully  described  in  the  earlier  section 
entitled  “Choosing  a  Feature  Set.”  For  two  cases— the 
1-ft  resolution,  PWF  data  of  the  Stockbridge  natural- 
clutter  dataset  and  the  Stockbridge  natural-  and 
cultural-clutter  dataset — we  found  the  best  features 
to  be  those  given  in  Table  1.  For  the  case  of  the 
1-ft  resolution,  PWF,  Concord  man-made  clutter,  the 
feature  set  reduced  to  those  features  given  in  Table  2. 

As  stated  earlier,  we  did  not  attempt  to  pick  the 
best  features  for  the  1-ft  resolution,  HH-polarization 
case.  Instead  we  evaluated  performance  with  the  same 
features  as  the  best-case  features  for  the  PWF  data. 

For  the  1-m  resolution,  PWF,  Stockbridge  natural- 
clutter,  and  the  natural-  and  cultural-clutter  case,  we 
found  the  best  features  to  be  those  given  in  Table  3. 
The  “optional”  qualifier  given  in  the  table  means  that 
the  feature  does  not  increase  or  decrease  any  perfor- 




FIGURE  14.  (a)  An  optical  photograph  and  (b)  a  SAR  image  of  a  farm-supply  store  in  Stockbridge,  New  York.  This 
store  is  an  example  of  a  man-made  clutter  discrete.  The  store  parking  lot  is  in  the  bottom  of  each  image.  Although  the 
photograph  and  the  SAR  image  were  taken  at  different  times,  passenger  cars  can  be  seen  in  the  parking  lot  in  both 
images.  The  bright  spots  in  the  middle  right  area  of  the  SAR  image  are  caused  by  various  metallic  objects  in  the  yard 
of  the  supply  store. 


44 


iHt  IdNCOlS  1A80RAM1R/  .HHjRNft!  VOLUVE  fi  M.lVBER  :  1993 


•  KREITHEN  El  At.. 

/  hscri  mi  nuting  /  r£<73  from  (  luiu  r 


(a) 


FIGURE  15.  (a)  A  SAR  image  of  man-made  clutter  in  Concord,  Massachusetts.  The  other  three  photographs 
illustrate  specific  objects  visible  in  the  SAR  image:  (b)  the  church  and  steeple,  (c)  a  spotlight  that  illuminates  the 
church  at  night,  and  (d)  a  house  and  a  telephone  wire  suspended  overhead.  Note  the  bright  columns  along  the 
side  of  the  church  in  the  SAR  image.  These  columns  clearly  correspond  in  number  and  placement  to  the  areas 
between  the  windows  of  the  church  in  the  optical  photograph.  Also  notice  the  bright  circular  feature — the  clock — 
on  the  church  steeple. 


4S 


•  KKHTtltN  L  I  A].. 

/  h't  nmiujiiii"  fvurti  (  inih  > 


Tablel.  Best  Features  for  1-ft,  PWF,  Natural-Clutter 
Dataset  and  Natural-  and  Cultural-Clutter  Dataset 

Feature  Description 

Fractal  dimension  Lincoln  Laboratory 

Weighted-rank  fill  ra+in  Lincoln  Laboratory 

Diameter  ERIM  size 

Mean  CFAR  or 

percent  bright  CFAR  ERIM  CFAR 

Percent  pure  ERIM  polarimetric 

Table  2.  Best  Features  for  1-ft,  PWF, 
Man-Made  Clutter  Dataset 

Feature  Description 

Fractal  dimension  Lincoln  Laboratory 

Percent  bright  CFAR  ERIM  CFAR 

Percent  pure  ERIM  polarimetric 

Table  3.  Best  Features  for  1-m,  PWF,  Natural-Clutter 
Dataset  and  Natural-  and  Cultural-Clutter  Dataset 

Feature  Description 

Fractal  dimension  Lincoln  Laboratory 

Diameter  ERIM  size 

Percent  bright  even  ERIM  polarimetric 

Percent  pure  ERIM  polarimetric 

Mean  CFAR  ERIM  CFAR  (optional) 

mance  ability  with  these  datasets,  but  it  could  add  or 
subtract  a  certain  amount  of  robustness  for  other 
datasets,  f  he  best  features  for  the  1-m  resolution, 
HH -polarization  case  are  given  in  I  able  4.  for  this 
case,  however,  the  discrimination  algorithm  provides 


T able  4.  Best  Features  for  1-m,  HH,  Natural-Clutter 
Dataset  and  Natural-  and  Cultural-Clutter  Dataset 

Feature  Description 

Fractal  dimension  Lincoln  Laboratory 

Diameter  ERIM  size 

Mean  CFAR  ERIM  CFAR  (optional) 

little  or  no  performance  gain  over  the  prescieener 
alone. 

Feature  Choice  Guidelines 

An examination of the list of best features from the
previous section, along with the scatterplots shown in
the section entitled “Confirming the Gaussian
Assumption,” reveals some interesting and important
guidelines for choosing the best features. There are
two general criteria for feature choice for this
discrimination algorithm: separation and orthogonality.
The separation criterion is the common-sense
consideration that the feature must adequately separate the
target training (and target testing) dataset from the
clutter false-alarm dataset. The orthogonality
criterion is less intuitive, and can be summarized by the
idea that different features used in the discrimination
algorithm must measure different attributes of the
region of interest.

Unfortunately, we cannot easily predict exactly
which attribute of a region of interest a feature
measures. Sometimes two features that beforehand would
seem to be highly correlated ultimately exhibit a low
degree of correlation. We show an example of this
type of behavior later in this section.

The best features listed in Table 1 are a good
example of the orthogonality criterion. We see that the
table includes two of the three Lincoln Laboratory
discrimination features, which is not surprising
because the three Lincoln Laboratory features were
designed with orthogonality in mind. The first feature
(fractal dimension) exploits the spatial relationship of
the top N scatterers in the region of interest, while the
second feature (weighted-rank fill ratio) exploits the
distribution of reflected power among all the
scatterers on the target. Clearly, these two features were
designed to measure different characteristics of the
region of interest.

The three other features included as best features
in the case shown in Table 1 all come from the ERIM
discrimination features. Interestingly, the three
chosen features each come from a different subset of
features: the first comes from the ERIM size features,
the second comes from the ERIM CFAR features,
and the third comes from the ERIM polarimetric
features. Even if the ERIM features were not designed
with the orthogonality criterion in mind, we find it
interesting that the choice of best features naturally
selects one feature from each category.

A subset of the features listed in Table 1 works best
in the man-made clutter dataset, as shown in Table 2.
The orthogonality criterion holds here as well; the two
features from Table 1 that are not included no longer
provide reasonable separation between targets and
clutter false alarms.

For the case of 1-m resolution, there is an apparent
exception to the two criteria given above in the best
feature choices. Notice that Table 3 contains two
ERIM polarimetric features. Figure 16 shows a
scatterplot of these two features (percent bright even
and percent pure) for the 1-m resolution datasets.
From the scatterplot we can see that these two
features are, in fact, uncorrelated and are therefore
orthogonal in some meaningful sense. Apparently, in
the 1-m resolution dataset the thresholding involved
in calculating the percent-bright-even feature causes
this feature to measure something other than the
polarimetric properties of the region of interest. The
five features chosen in Table 3 for the 1-m resolution
case obey both the separation and orthogonality
criteria given above. The same holds true for the best
features in the HH-polarization case, which are given
in Table 4.

FIGURE 16. Scatterplot of percent bright even feature versus percent pure feature for 1-m-by-1-m resolution data.
These two features are uncorrelated because the data points do not fall along a straight line. The plotted classes are
Stockbridge natural-clutter false alarms, Stockbridge natural- and cultural-clutter false alarms, target testing, and
target training.
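The separation and orthogonality criteria discussed in this section can be checked numerically for any candidate feature set. The sketch below is only an illustration and is not the procedure used in the study: it assumes that separation is measured as the distance between class means in units of pooled standard deviation, and that orthogonality is judged from the sample correlation coefficient between features. The function names are hypothetical.

```python
import numpy as np

def separation_score(target_vals, clutter_vals):
    """Distance between class means in units of pooled standard deviation.

    This particular measure is an assumption made for illustration.
    """
    t = np.asarray(target_vals, dtype=float)
    c = np.asarray(clutter_vals, dtype=float)
    pooled_std = np.sqrt(0.5 * (t.var(ddof=1) + c.var(ddof=1)))
    return abs(t.mean() - c.mean()) / pooled_std

def feature_correlation(features):
    """Correlation-coefficient matrix of a feature table.

    Rows are regions of interest; columns are features. Off-diagonal
    entries near zero suggest that two features measure different
    attributes of the region of interest.
    """
    return np.corrcoef(np.asarray(features, dtype=float), rowvar=False)
```

A feature with a large separation score that is also weakly correlated with the features already chosen is a natural candidate under the two criteria.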


Real  Data  versus  Theoretical  Performance 

This section gives the real-data prescreener and
discrimination results in the form of plots of Pd versus
FA/km². These plots are explained in the sidebar
entitled “Interpreting Plots of Pd versus FA/km².” We
also plot on the same graphs the predictions
computed from the theoretical analysis given in the
section entitled “Theoretical Analysis of the One-Class
Quadratic Discrimination Algorithm.” In all cases,
the theory and real data coincide closely. This fact
demonstrates that the one-class quadratic
discrimination algorithm is well understood as it is implemented
in the Lincoln Laboratory multistage target-detection
algorithm.
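A curve of detection probability versus false alarms per square kilometer can be traced by sweeping a threshold over the scores assigned to target and clutter regions of interest. The following sketch assumes that each region of interest carries a single distance score for which smaller values are more targetlike, and that the imaged clutter area is known. The function name and arguments are hypothetical.

```python
import numpy as np

def pd_vs_fa_density(target_scores, clutter_scores, area_km2):
    """Sweep a threshold over distance scores (lower = more targetlike).

    Returns the thresholds, the probability of detection at each
    threshold, and the number of false alarms per square kilometer.
    """
    t = np.sort(np.asarray(target_scores, dtype=float))
    c = np.sort(np.asarray(clutter_scores, dtype=float))
    thresholds = np.unique(np.concatenate([t, c]))
    # A region is declared a target when its score is <= threshold.
    pd = np.searchsorted(t, thresholds, side="right") / t.size
    fa = np.searchsorted(c, thresholds, side="right") / area_km2
    return thresholds, pd, fa
```

Plotting `pd` against `fa` on a logarithmic horizontal axis gives a curve of the same general form as the plots discussed in this section.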

Figure 17 gives the combined prescreener and
discrimination results for the 1-ft resolution, PWF data
for the Stockbridge natural-clutter dataset, while
Figure 18 gives the prescreener and discrimination
results for the Stockbridge natural- and cultural-clutter
dataset, and Figure 19 gives the prescreener and
discrimination results for the Concord man-made-clutter
dataset. Figures 20 and 21 show the prescreener
and discrimination results for the Stockbridge
natural-clutter dataset and the Stockbridge natural- and
cultural-clutter dataset, respectively. Both results are
for 1-ft resolution, HH-polarization data.

The remaining results are for the 1-m resolution
case. Figures 22 and 23 show the prescreener and
discrimination results for the Stockbridge
natural-clutter dataset and the Stockbridge natural- and
cultural-clutter dataset, respectively, for PWF data.
Figures 24 and 25 show the prescreener and
discrimination results for the same two datasets for the
HH-polarization case.

[Plots for Figures 17 through 25: Pd (0.0 to 1.0) versus FA/km² (logarithmic axis, 0.001 to 1000); curves labeled
Discrimination, Prescreener, and Theory.]

FIGURE 17. Comparison of real data and theoretical results for the 1-ft-by-1-ft resolution, PWF, Stockbridge
natural-clutter case.

FIGURE 18. Comparison of real data and theoretical results for the 1-ft-by-1-ft resolution, PWF, Stockbridge
natural- and cultural-clutter case.

FIGURE 19. Comparison of real data and theoretical results for the 1-ft-by-1-ft resolution, PWF, Concord
man-made clutter case.

FIGURE 20. Comparison of real data and theoretical results for the 1-ft-by-1-ft resolution, HH-polarization,
Stockbridge natural-clutter case.

FIGURE 21. Comparison of real data and theoretical results for the 1-ft-by-1-ft resolution, HH-polarization,
Stockbridge natural- and cultural-clutter case.

FIGURE 22. Comparison of real data and theoretical results for the 1-m-by-1-m resolution, PWF, Stockbridge
natural-clutter case.

FIGURE 23. Comparison of real data and theoretical results for the 1-m-by-1-m resolution, PWF, Stockbridge
natural- and cultural-clutter case.

FIGURE 24. Comparison of real data and theoretical results for the 1-m-by-1-m resolution, HH-polarization,
Stockbridge natural-clutter case.

FIGURE 25. Comparison of real data and theoretical results for the 1-m-by-1-m resolution, HH-polarization,
Stockbridge natural- and cultural-clutter case.

Polarization Comparisons

We can compare the discrimination results from the
PWF data and the HH-polarization data for the same
cases to draw a conclusion regarding the advantage of
using a fully polarimetric radar versus using the more
common HH single-polarization radar. We see that
the performance increase is significant (a reduction of
one to two orders of magnitude in false-alarm rate for
equal probabilities of detection) in both the 1-ft
resolution and the 1-m resolution cases. The performance
difference is more pronounced for the higher-resolution
data.
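A comparison "at equal probabilities of detection" can be made concrete by interpolating two performance curves to a common Pd and taking the difference of the base-10 logarithms of their false-alarm rates. The sketch below assumes that each curve is supplied as arrays of Pd (increasing) and FA/km² (positive) values. The function name is hypothetical, and any numbers passed to it would be illustrative rather than results from this study.

```python
import numpy as np

def fa_reduction_orders(pd_a, fa_a, pd_b, fa_b, pd_level):
    """Orders of magnitude by which curve A reduces FA/km^2 relative to curve B.

    Both curves are interpolated in log10(FA) at the common detection
    probability pd_level. The pd arrays must be increasing.
    """
    log_fa_a = np.interp(pd_level, pd_a, np.log10(fa_a))
    log_fa_b = np.interp(pd_level, pd_b, np.log10(fa_b))
    return float(log_fa_b - log_fa_a)
```

A returned value of 2.0 would mean that curve A achieves the same Pd with one hundred times fewer false alarms per square kilometer than curve B.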

In general, during the course of this study, we
noticed that the combination of higher-resolution
data and fully polarimetric data provided a significant
increase in performance. Either capability alone is not
nearly as effective as the two capabilities together for
the discrimination features we have studied in this
article. In other words, building radars with both
higher resolution and fully polarimetric capability
makes sense.

In the 1-m resolution case, the difference between
the PWF data and the HH-polarization data is clear.
Using the HH data alone, the discrimination
algorithm provides little or no performance improvement
over using the prescreener algorithm alone. The
features for the 1-m, HH-polarization case were tuned
specifically for these datasets, so this result should be
considered a best case. Clearly, there is no point in
using the discrimination algorithm with these
features for the 1-m resolution, HH-polarization dataset,
because it provides little benefit and it requires
additional computational capacity.

Resolution  Comparisons 

We can also compare the results from the 1-ft
resolution case with the results from the 1-m resolution
case. We see that the higher-resolution data allow a
performance increase of more than an order of
magnitude in terms of the false-alarm rate for a given
probability of detection. This performance increase is
approximately constant over the different cases given
in Figures 17 to 25.

Conclusion 

In this article, we discuss and evaluate the
discrimination algorithm used in the Lincoln Laboratory
multistage target-detection algorithm. This one-class
quadratic discriminator uses features calculated from SAR
imagery. The discrimination algorithm uses candidate
regions of interest identified by the prescreener, and
ideally eliminates all natural-clutter false alarms
from further consideration, passing only targets and
man-made clutter false alarms to the classification
algorithm.
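The idea of a one-class quadratic discriminator can be sketched as follows. This is a minimal illustration under stated assumptions, not the implementation evaluated in this article: it assumes that the discriminator is trained on target feature vectors only, that the quadratic statistic is the Mahalanobis-type distance from the target training mean normalized by the number of features, and that a region of interest is kept when this distance does not exceed a threshold. All names are hypothetical.

```python
import numpy as np

def fit_one_class(target_features):
    """Estimate mean and inverse covariance from target-only training features."""
    x = np.asarray(target_features, dtype=float)
    mean = x.mean(axis=0)
    cov = np.atleast_2d(np.cov(x, rowvar=False))
    return mean, np.linalg.inv(cov)

def quadratic_distance(features, mean, inv_cov):
    """Quadratic distance of each feature vector from the target class.

    The division by the number of features is an assumed normalization.
    """
    d = np.atleast_2d(np.asarray(features, dtype=float)) - mean
    return np.einsum("ij,jk,ik->i", d, inv_cov, d) / mean.size

def is_targetlike(features, mean, inv_cov, threshold):
    """True for regions of interest whose distance is within the threshold."""
    return quadratic_distance(features, mean, inv_cov) <= threshold
```

Because only the target class is modeled, clutter never has to be characterized during training; clutter is rejected simply because its feature vectors lie far from the target class.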

Fifteen discrimination features were evaluated for
this study; three of the features were developed by
Lincoln Laboratory and the remainder were
developed by the three STAR contractors. These features
were modified to account for the different types of
data used in this study, and the best set of features was
chosen for a number of different datasets and a
number of different types of data. The best features
remained constant from the natural-clutter dataset to
the natural- and cultural-clutter dataset, which was a
surprising and pleasing result. For best performance,
we needed to select different feature sets for PWF and
HH-polarization data, as well as for 1-ft and 1-m
resolution data, which was not a surprising result.

We evaluated the features by using a theoretical
expression that accurately predicted the real-data
performance of the discrimination algorithm. This
accuracy reflects a good understanding of how the
discrimination algorithm functions as a part of the
Lincoln Laboratory multistage target-detection
algorithm for SAR data.

Acknowledgments 

The  authors  would  like  to  thank  Dr.  Leslie  Novak  for 
the  help  and  encouragement  given  throughout  the 
course  of  this  study. 







DANIEL E. KREITHEN

is a staff member in the Surveillance
Systems group. His
research speciality is in the
detection of stationary targets
in synthetic-aperture radar
imagery. He received a Sc.B.
degree in engineering from
Brown University, and a
degree in electrical
engineering from Princeton
University. Dan has been at
Lincoln Laboratory since 1988.


SHAWN D. HALVERSEN

is an associate staff member in
the Surveillance Systems group.
His research speciality is in the
detection and discrimination of
stationary targets. He received a
B.A. degree in mathematics and
a B.S. degree in physics from
the University of West Florida,
and an M.A. degree in
mathematics from the University of
Wisconsin. He has been at
Lincoln Laboratory since 1990.


GREGORY J. OWIRKA
is an assistant staff member in
the Surveillance Systems group.
He received a B.S. degree (cum
laude) in applied mathematics
from Southeastern Massachusetts
University, and he is
currently working on an M.S.
degree in electrical engineering
at Northeastern University.
Greg's current research interests
are in automatic target
recognition. He has been at Lincoln
Laboratory since 1987.


Improving  a  Template-Based 
Classifier  in  a  SAR  Automatic 
Target  Recognition  System  by 
Using  3-D  Target  Information 

Shawn  M.  Verbout,  William  W.  Irving,  and  Amanda  S.  Hanes 

In this article we propose an improved version of a conventional template-
matching classifier that is currently used in an operational automatic target
recognition system for synthetic-aperture radar (SAR) imagery. This classifier
was originally designed to maintain, for each target type of interest, a library of
2-D reference images (or templates) formed at a variety of radar viewing
directions. The classifier accepts an input image of a target of unknown type,
correlates this image with a reference template selected (by matching radar
viewing direction) from each target library, and then classifies this image to the
target category with the highest correlation score. Although this algorithm
seems reasonable, it produces surprisingly poor classification results for some
target types because of differences in SAR geometry between the input image
and the best-matching reference image. Each reference library is indexed solely
by radar viewing direction, and is thus unable to account for radar motion
direction, which is an equally important parameter in specifying SAR imaging
geometry. We correct this deficiency by incorporating a model-based reference
generation procedure into the original classifier. The modification is
implemented by (1) replacing each library of 2-D templates with a library of
3-D templates representing complete 3-D radar-reflectivity models for the target
at each radar viewing direction, and (2) including a mathematical model of the
SAR imaging process so that any 3-D template can be transformed into a 2-D
image corresponding to the appropriate radar motion direction before the
correlation operation is performed. We demonstrate experimentally that the
proposed classifier is a promising alternative to the conventional classifier.

An automatic target recognition (ATR)
system is an integrated collection of
algorithms designed to process sensor
measurements so that targets can be efficiently detected and
identified. The algorithms that comprise an ATR
system are applied on a computer and are organized so
that human intervention is not required.

An important component of any ATR system is its
classifier. The function of the classifier is to take input
measurements that represent detected targets and
categorize these inputs according to target type. The
classifier is designed with the assumption that each
input belongs to one and only one category from a
predetermined set (e.g., tank, truck, gun), and that the
input has certain observable characteristics that aid in
its assignment to this category. The classifier output
corresponding to each input is an estimate of the
correct category label, based on the observable
characteristics of the input.

[Block diagram: an input image passes through three stages that, in turn, reject imagery without potential
targets, reject natural-clutter false alarms, and reject man-made clutter while classifying targets.]

FIGURE 1. Block diagram of the three-stage SAR automatic target recognition system developed by Novak. The input
consists of SAR imagery representing many square kilometers of terrain and potentially containing several targets of
interest; the output consists of locations and classification labels for these targets. This article proposes an improved
version of the classifier stage.

Of course, the issues involved in building a
classifier vary according to the kinds of sensor
measurements being processed. In this article, we are
concerned exclusively with a classifier whose inputs are
2-D synthetic-aperture radar (SAR) images. In
particular, we analyze and suggest extensions to the
structure of the classifier in the SAR ATR system
developed by Leslie M. Novak (see the article entitled
“Performance of a High-Resolution Polarimetric SAR
Automatic Target Recognition System” in this issue).
This ground-based system has been designed to
operate in an off-line, experimental setting. It has been
rigorously tested over the past five years, and it is one
of the first systems of its kind to process large
quantities of actual SAR data.

Novak's SAR ATR system is conveniently
decomposed into a sequence of three processors: a detector
(or prescreener), a discriminator, and a classifier (see
Figure 1). The detector searches through imagery
representing many square kilometers of terrain, and
outputs a collection of regions of interest centered at
possible target locations. (Each region of interest is a
subimage extracted from the original SAR dataset;
collectively, all of the regions of interest comprise
only a small fraction of this original dataset.) The
discriminator applies further processing to distinguish
between two kinds of regions of interest: those
containing man-made objects (i.e., either targets or
man-made clutter) and those containing natural clutter.
All regions of interest that appear to contain natural
clutter are discarded. Finally, the classifier assigns each
remaining region of interest to a predefined target
category, or to a none-of-the-above category if the
region of interest appears to contain man-made clutter.

Although Novak's ATR system is usually applied to
the multiclass target identification problem (i.e., the
problem in which two or more kinds of targets must
be distinguished), for convenience we consider only
the one-class problem in this article. In the one-class
problem only one kind of target is of interest;
therefore, the classifier output reduces to a simple yes/no
decision that indicates whether a target of this kind is
present in the region of interest. Figure 2 illustrates
the input-output operation of a one-class classifier.

Novak's classifier, which we refer to as the baseline
classifier, uses a conventional template-matching
algorithm. The one-class version of this classifier is
implemented in the following way. For a particular
target of interest, the classifier has a database of stored
reference images, each formed by using a different
radar viewing direction. The reference image whose
associated radar viewing direction best approximates
that of the incoming test image (i.e., the incoming
region of interest) is called up from the database and
is correlated with the test image to generate a match
score. If this score exceeds a predetermined threshold,
the classifier declares that a target of interest is present.

FIGURE 2. Illustration of the input-output operation of the
baseline one-class classifier. The input is a subimage, or
region of interest, extracted from the original SAR dataset;
the output is a decision indicating whether a target of
interest is present.
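The one-class template-matching procedure just described can be sketched in a few lines. The sketch is a simplified illustration and not the baseline classifier itself: it assumes that the radar viewing direction is summarized by a single aspect angle in degrees, that the test image and the reference images have the same size and are already registered (so no search over spatial offsets is performed), and that the match score is a zero-mean normalized correlation. All names are hypothetical.

```python
import numpy as np

def angular_difference_deg(a, b):
    """Smallest absolute difference between two angles in degrees."""
    return abs((a - b + 180.0) % 360.0 - 180.0)

def normalized_correlation(test, ref):
    """Zero-mean, unit-norm correlation of two equally sized images."""
    t = test - test.mean()
    r = ref - ref.mean()
    denom = np.linalg.norm(t) * np.linalg.norm(r)
    return float((t * r).sum() / denom) if denom > 0 else 0.0

def classify_one_class(test_image, test_aspect_deg, library, threshold):
    """Decide whether the target of interest is present.

    library is a list of (aspect_deg, reference_image) pairs for one
    target type. The reference whose aspect angle is closest to that of
    the test image is correlated with the test image, and the target is
    declared present when the score exceeds the threshold.
    """
    aspect, ref = min(
        library, key=lambda e: angular_difference_deg(e[0], test_aspect_deg)
    )
    score = normalized_correlation(
        np.asarray(test_image, dtype=float), np.asarray(ref, dtype=float)
    )
    return score > threshold, score, aspect
```

Note that the library in this sketch is indexed only by viewing direction, which is exactly the limitation that the remainder of the article addresses.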

The template-matching algorithm is attractive
because it is readily implemented on a computer and it
has an intuitively pleasing structure. For a database
formed by using a typical imaging configuration,
however, the classifier produces poor results for some
target types. This fact is not surprising, because the
system was originally designed to process images
formed with a fixed SAR geometry, whereas in the
most commonly used imaging configurations the SAR
geometry is continually changing. In this article, we
seek to generalize the structure of the classifier to
account for variability in SAR geometry.

SAR geometry can be characterized as a function
of two parameters: radar viewing direction and
radar motion direction. The baseline classifier does not
account for radar motion direction, however, and is
therefore equipped with an incomplete set of
reference images in its database. The baseline classifier was
designed with the assumption that the radar viewing
direction is the only parameter that can be varied to
produce different images of a target.

In reality, the direction of radar motion is an equally
important parameter in defining the SAR imaging
geometry, which implies that two images formed with
the same radar viewing direction, but with different
radar motion directions, will look different. Even
though the same physical target scatterers are
illuminated in both cases, the 3-D scatterer positions
become mapped to two different 2-D SAR image
locations. Because the baseline classifier ignores the
direction-of-motion parameter, it often correlates a
test image and a reference image that are formed with
different SAR imaging geometries. These differences
in imaging geometry cause the test image and
reference image to have dissimilar characteristics, and
consequently to have a low correlation score.
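The dependence of the 2-D image location on both directions can be illustrated with a simple geometric model. The sketch below is an idealization introduced here for illustration and is not the imaging model developed in this article: it assumes a far-field geometry in which the range coordinate of a scatterer is its projection onto the line of sight, and the cross-range coordinate is its projection onto the component of the radar velocity that is perpendicular to the line of sight. The function and argument names are hypothetical.

```python
import numpy as np

def sar_image_coords(points, view_dir, motion_dir):
    """Project 3-D scatterer positions to (range, cross-range) pairs.

    points: positions relative to the scene center, one row per scatterer.
    view_dir: vector from the radar toward the scene center.
    motion_dir: radar velocity direction (must not be parallel to view_dir).
    """
    u = np.asarray(view_dir, dtype=float)
    u = u / np.linalg.norm(u)
    v = np.asarray(motion_dir, dtype=float)
    c = v - v.dot(u) * u  # velocity component perpendicular to the line of sight
    c = c / np.linalg.norm(c)
    p = np.atleast_2d(np.asarray(points, dtype=float))
    return np.column_stack([p @ u, p @ c])
```

Under these assumptions, an elevated scatterer at (0, 0, 1) viewed along (1, 0, 0) maps to cross-range 0 when the radar moves along (0, 1, 0), but to a nonzero cross-range when the radar moves along (0, 1, 1), even though the viewing direction is unchanged.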

An immediate solution to this problem is to
include in the classifier database additional reference
images formed by using the SAR geometries that are
not currently represented. This solution is
undesirable, however, because it would require a costly data
collection, and it would also increase the storage
requirements for the database by roughly an order of
magnitude.

In  this  article  we  describe  a  more  elegant  solution 
for  improving  classifier  performance  than  the  mere 
tenfold  augmentation  of  the  reference  set  described 
above.  This  new  solution,  which  maintains  the  tradi¬ 
tional  template-matching  engine,  calls  for  two  major 
modifications  to  the  baseline  classifier:  (1)  the  re¬ 
placement  of  the  present  set  of  2-D  reference  images 
with  a  set  of  3-D  templates,  and  (2)  the  incorporation 
of  a  mathematical  model  of  the  SAR  imaging  process 
so that any 3-D template can be appropriately transformed
to synthesize a 2-D reference image for the
correlation  operation.  Later  in  this  article  we  describe 
a  novel  method  for  creating  3-D  templates  from  cur¬ 
rently  existing  2-D  target  images. 

The  body  of  the  article  is  divided  into  three  major 
sections. In the first section we describe in detail how
the  baseline  classifier  works.  In  the  second  section  we 
demonstrate  the  problem  with  this  classifier  and  ex¬ 
plain  why  this  problem  exists.  In  the  third  section  we 
describe  specifically  how  we  can  modify  the  baseline 
classifier  to  improve  its  overall  performance.  Finally, 
we  summarize  the  key  points  of  the  article  and  sug¬ 
gest  directions  for  future  work  toward  improving  clas¬ 
sification  performance  in  a  SAR  ATR  system. 

How  the  Baseline  Classifier  Works 

The  algorithm  used  by  the  baseline  classifier  is  de¬ 
scribed  schematically  in  Figure  3.  As  shown  in  this 
figure,  the  input  to  the  classifier  consists  of  two  com¬ 
ponents.  The  first  input  component  is  a  2-D  test 
image  representing  a  region  of  interest  from  the  origi¬ 
nal  SAR  dataset.  As  mentioned  above,  this  image  has 
passed  through  the  first  two  stages  of  the  ATR  system 
(i.e.,  the  detection  and  discrimination  stages)  and 
thus  contains  an  object  that  appears  sufficiently 
targetlike  to  be  considered  for  classification.  The  sec¬ 
ond  input  component  is  a  pair  of  angle  values  that 


• VERBOUT ET AL.

Improving a Template-Based Classifier in a SAR Automatic Target Recognition System by Using 3-D Target Information


FIGURE 3. Schematic description of the algorithm used by the baseline one-class classifier. The classifier uses the aspect
angle α and the depression angle θ to select a 2-D reference image from the database. This reference image is then
correlated with the input test image; if the correlation score ρ is greater than or equal to the threshold τ, the target-present
decision is declared.


define the radar viewing direction with respect to the
imaged object. These values are estimates of the angles
α (the aspect) and θ (the depression), which are defined
pictorially in Figure 4.

Because the database is conveniently indexed according
to these two radar viewing angles, the classifier
can readily select the reference image whose aspect
and depression are closest to the input estimates
of α and θ computed for the test image. Once the
appropriate reference image is selected, it is scaled so
that the sum of the squares of its pixel values is equal
to unity; the test image is also scaled in this way. Next,
the normalized test and reference images are correlated
to yield a correlation score ρ whose value is
between 0 and 1.
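The selection and scaling steps described above can be sketched as follows. This is a minimal illustration, not the Laboratory's implementation: the function names, the dictionary-based database, and the rule used to combine the aspect and depression differences (a simple sum of absolute differences, with aspect wrapping at 360°) are all assumptions made for the example.

```python
import math


def aspect_difference(a_deg, b_deg):
    """Smallest absolute difference between two aspect angles, in degrees."""
    d = abs(a_deg - b_deg) % 360.0
    return min(d, 360.0 - d)


def select_reference(database, aspect_deg, depression_deg):
    """Return the (aspect, depression) key closest to the input estimates.

    database maps (aspect_deg, depression_deg) -> image (list of rows).
    ASSUMPTION: closeness is the sum of the two absolute angle differences.
    """
    return min(
        database,
        key=lambda k: aspect_difference(k[0], aspect_deg)
        + abs(k[1] - depression_deg),
    )


def scale_to_unit_energy(image):
    """Scale an image so that the sum of its squared pixel values is 1."""
    energy = sum(v * v for row in image for v in row)
    if energy == 0.0:
        return [list(row) for row in image]  # all-zero image: nothing to scale
    g = 1.0 / math.sqrt(energy)
    return [[g * v for v in row] for row in image]
```

With this scaling applied to both images, the peak of their cross-correlation cannot exceed 1 for nonnegative amplitudes, which is why the score falls between 0 and 1.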

This correlation operation is mathematically defined
in the following way. Let us assume that the test
and reference images are equal in size, each having M
cells in the range dimension and N cells in the cross-range
dimension. Let the function T(·,·) be defined
for integer values of its arguments such that T(m, n) is
equal to the amplitude of the test image at the range
and cross-range location (m, n) for 1 ≤ m ≤ M and
1 ≤ n ≤ N, and is equal to zero for all other values of m
and n. Let the function R(·,·) be defined analogously
with respect to the reference image. Then the
correlation score ρ for the two images is defined by

$$
\rho = \max_{i,j} \left\{ \frac{1}{s} \sum_{m=1}^{M} \sum_{n=1}^{N} T(m,n)\, R(i+m,\, j+n) \right\},
$$

where s is the overall normalization factor given by

$$
s = \sqrt{\sum_{m=1}^{M} \sum_{n=1}^{N} T^{2}(m,n)} \; \sqrt{\sum_{m=1}^{M} \sum_{n=1}^{N} R^{2}(m,n)} .
$$

As shown in Figure 3, the classifier declares that a
target is present in the test image only if ρ is greater
than or equal to the preselected threshold τ.
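The correlation score and threshold test can be illustrated with a short brute-force sketch. It assumes the score is the peak, over all relative shifts, of the cross-correlation divided by the product of the two image norms, with pixels outside either image treated as zero; the function names are hypothetical, and a practical system would likely use a faster (e.g., FFT-based) correlation.

```python
import math


def correlation_score(test, ref):
    """Peak normalized cross-correlation of two equal-size amplitude images.

    test, ref: lists of M rows (range) by N columns (cross-range).
    Pixels outside either image are treated as zero.
    """
    M, N = len(test), len(test[0])
    # Overall normalization factor: product of the two image norms.
    norm_t = math.sqrt(sum(v * v for row in test for v in row))
    norm_r = math.sqrt(sum(v * v for row in ref for v in row))
    s = norm_t * norm_r
    if s == 0.0:
        return 0.0
    best = 0.0
    # Search every relative shift (i, j) for which the images can overlap.
    for i in range(-(M - 1), M):
        for j in range(-(N - 1), N):
            total = 0.0
            for m in range(M):
                for n in range(N):
                    mm, nn = m + i, n + j
                    if 0 <= mm < M and 0 <= nn < N:
                        total += test[m][n] * ref[mm][nn]
            best = max(best, total / s)
    return best


def target_present(test, ref, threshold):
    """Declare a target only if the score meets or exceeds the threshold."""
    return correlation_score(test, ref) >= threshold
```

Because the maximum is taken over shifts, a test image that is merely a translated copy of the reference still scores 1; only a change in the relative arrangement of the scatterers lowers the score.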




FIGURE 4. Pictorial definition of the aspect angle α and
the depression angle θ. These angles specify the radar
viewing direction with respect to the imaged object.

We  can  more  clearly  understand  the  fundamental 
problem  with  the  baseline  classifier  by  analyzing  how 
the  target  reference  images  are  generated  for  the  clas¬ 
sifier  database.  Each  reference  image  is  formed  from 
data collected by the Lincoln Laboratory millimeter-wave
airborne radar [1]. Once a target of interest is
deployed  in  an  open  area,  the  data  are  collected  by 
using  a  special  mode  of  the  radar  known  as  spotlight 
mode,  which  is  illustrated  in  Figure  5.  In  this  mode, 


FIGURE 5.  Imaging  configuration  for  spotlight-mode  SAR. 
In  this  mode  the  airplane  moves  in  a  straight  line  at  a 
constant  altitude,  while  the  antenna  is  steered  continu¬ 
ously  so  that  it  always  points  at  a  fixed  patch  of  terrain. 


the  airplane  flies  in  a  straight  line  at  constant  altitude, 
and  the  radar  antenna  is  steered  continuously  so  that 
it  always  points  in  the  direction  of  the  target. 

With  the  radar  beam  illuminating  the  target  like  a 
spotlight  throughout  the  flight,  a  new  image  of  the 
target  can  be  formed  approximately  every  degree  of 
azimuth.  Each  new  image  can  be  used  as  a  reference 
for  the  classifier  database.  A  set  of  reference  images 
representing 360° of aspect coverage is created by
flying four such linear paths to view the target from
all  sides,  as  shown  in  Figure  6. 

Although  spotlight  mode  is  not  the  only  mode  that 
can  be  used  to  generate  reference  images,  it  is  the 
most  convenient  and  efficient  mode  for  imaging  a 
target  at  a  variety  of  radar  viewing  angles.  To  see  why 
this  statement  is  true,  consider  the  database  of  spot¬ 
light  imagery  that  can  be  generated  by  flying  the  basic 
pattern  shown  in  Figure  6  at  a  sequence  of  increasing 
altitudes.  Clearly,  if  the  difference  between  successive 
flight-path  altitudes  is  small  enough,  then  the  data¬ 
base  will  contain  a  representative  image  of  the  target 
that  is  close  to  any  desired  aspect-angle  and  depres¬ 
sion-angle  pair.  Moreover,  this  complete  coverage  is 
obtained  without  ever  having  to  move  the  target.  In 
spite  of  the  many  advantages  to  using  this  kind  of 
data-collection  procedure,  there  is  a  serious  deficiency 
associated  with  it.  This  deficiency  is  analyzed  in  detail 
in  the  next  section. 

Why  the  Baseline  Classifier  Needs  Improvement 

Now  that  we  have  discussed  the  method  used  to 
generate  target  reference  images  for  the  database,  we 
are  better  equipped  to  analyze  why  the  baseline  classi¬ 
fier  can  make  a  gross  error  in  categorizing  an  input 
test  image.  In  this  section  we  explain  how  such 
misclassifications occur, even though the database is
densely  populated  with  target  reference  images  from 
all  desired  radar  viewing  directions. 

We  begin  by  using  Figure  7  to  demonstrate  what  is 
wrong  with  the  baseline  classifier.  Figure  7(a)  shows 
an  optical  photograph  of  an  M48  tank,  and  Figures 
7(b),  7(c),  and  7(d)  show  three  simulated  SAR  images 
of  the  tank.  The  SAR  images  are  color  coded  with  a 
scale  that  makes  a  gradual  transition  from  black  (low 
intensity)  to  green  (medium  intensity)  to  white  (high 
intensity).  In  each  SAR  image,  the  front  part  of  the 



tank is pointing toward the upper left corner of the
image. All three SAR images shown in this figure were
formed by using the same aspect angle and depression
angle. In other words, the same scatterers on the
target were illuminated by the radar from the same
viewing direction for each image. Note, however, that
the images look dramatically different. This phenomenon
contradicts the key design assumption that fixing
the radar viewing direction uniquely specifies the
SAR image of the target.

To understand how the existence of three such
images affects the classifier, let us assume that Figure
7(b) is a stored reference image and that Figure 7(d) is
an incoming test image. Because the two images were
formed by using exactly the same radar viewing directions,
the image in Figure 7(b) would be chosen as the
reference image most likely to match the test image.
But because the two images are so dissimilar, their
correlation score would be low, and consequently the
test image could be erroneously labeled as containing
no target.

The only difference between the SAR imaging configurations
used to generate the images in Figures
7(b), 7(c), and 7(d) was the direction in which the
radar was moving with respect to the viewing direction.
This change alone is sufficient to yield SAR
images that look quite different, and yet the direction-of-motion
parameter has been completely ignored
in the design of the baseline classifier.

To see how this parameter directly affects the appearance
of a SAR image, we devote much of this
section to the description and application of a widely
used mathematical model of the SAR imaging process.
In particular, we model the SAR transformation
as a projection of the 3-D distribution of target scatterers
onto a 2-D image plane, and we demonstrate
the usefulness of this model by a simple example.

The projection model is conceptually important


FIGURE  6.  Top  view  of  the  flight  path  used  to  create  a  set  of  target  reference  images  representing  360°  of  aspect  coverage. 
A  sequence  of  the  generated  reference  images  is  shown  notionally  at  right. 



FIGURE  7.  (a)  Photograph  of  an  M48  tank.  The  computer-generated  images  in  (b),  (c),  and  (d)  are  simulated  SAR  images  of 
an M48 tank, all created with the same aspect and depression angles but with different squint angles φ, where φ specifies the
direction  in  which  the  radar  is  moving  with  respect  to  the  viewing  direction.  The  SAR  images  are  color  coded  with  a  scale 
that  makes  a  gradual  transition  from  black  (low  intensity)  to  green  (medium  intensity)  to  white  (high  intensity).  In  each  SAR 
image,  the  front  part  of  the  tank  is  pointing  toward  the  upper  left  corner  of  the  image.  (The  SAR  images  were  generated  with 
the  SARTOOL  signature  simulation  software  developed  by  The  Analytical  Sciences  Corporation.) 


for the remainder of the article, particularly in the
final section in which we incorporate this model into
an improved version of the baseline classifier. We now
prepare to introduce the projection model with some
fundamental definitions associated with the SAR imaging
process.

Description of SAR Imaging Geometry

Figure 8 illustrates the basic elements that define the
spotlight SAR imaging geometry. In this figure, we
see the airborne radar as it moves in a straight line
while its antenna is steered to illuminate a fixed ground
location known as the aimpoint. We have imposed a
mathematical structure on this geometry by using a
Cartesian coordinate system (known as the world coordinate
system) whose origin coincides with the
aimpoint, and whose coordinate locations (x, y, z) are
measured in terms of the unit basis vectors x, y, and
z (shown in blue). In this system, we use the convention
that y points in the direction of radar motion.
Also, note that in the vicinity of the aimpoint we
model the local earth surface as a ground plane defined
by the equation z = 0.

The line that passes through the radar position and




this justification and merely provide a concise mathematical
description of our SAR imaging model. Following
this description, we give an application of our
SAR imaging model in the form of a simple visual
example.

A  Mathematical  Model  for  SAR  Imaging 

We  begin  our  description  of  the  SAR  transformation 
by  introducing  the  radar  coordinate  system  (shown  in 
red)  in  Figure  8.  The  origin  of  this  Cartesian  coordi¬ 
nate  system  coincides  with  the  aimpoint,  and  the 
coordinate  locations  are  represented  in  terms  of  the 
unit  basis  vectors  r  (the  range  vector),  c  (the  cross- 
range  vector),  and  n  (the  slant-plane  normal  vector). 
This  coordinate  system  can  be  defined  in  terms  of  the 
basic  elements  of  the  spotlight  SAR  geometry  defined 
above. 

We begin with the observation that the range vector
r, which points in the direction of the radar line
of sight, can be expressed in world coordinates as

$$
\mathbf{r} = \begin{bmatrix} \cos\phi \, \cos\theta \\ \sin\phi \, \cos\theta \\ -\sin\theta \end{bmatrix} .
$$

To check that this expression is correct, the reader can
easily verify the following three properties of r: (1) r
is unit length, (2) the projection of r onto the x-y
plane is rotated counterclockwise by the angle φ with
respect to the x-axis, and (3) r is tilted downward by
the angle θ with respect to the x-y plane.

By using the vector r and the world-coordinate
basis vector y = [0 1 0]^T, both of which lie in the
slant plane, we can construct the slant-plane normal
vector n with the cross-product formula

$$
\mathbf{n} = \frac{\mathbf{r} \times \mathbf{y}}{\lVert \mathbf{r} \times \mathbf{y} \rVert}
= k \begin{bmatrix} \sin\theta \\ 0 \\ \cos\phi \, \cos\theta \end{bmatrix},
$$

where k is the normalizing constant required to make
n a unit-length vector. The value of k is given by

$$
k = \frac{1}{\sqrt{\sin^{2}\theta + \cos^{2}\phi \, \cos^{2}\theta}} . \tag{1}
$$

Because the cross-range vector c must be perpendicular
to both r and n, it is constructed by using the
formula

$$
\mathbf{c} = \mathbf{n} \times \mathbf{r}
= k \begin{bmatrix} -\sin\phi \, \cos\phi \, \cos^{2}\theta \\ \cos^{2}\phi \, \cos^{2}\theta + \sin^{2}\theta \\ \sin\phi \, \cos\theta \, \sin\theta \end{bmatrix} .
$$

From this coordinate-system construction, we see that
the vectors r and c form an orthonormal basis for the
slant plane, just as the vectors x and y form an
orthonormal basis for the ground plane.

Now we consider imaging a point reflector at location
p = [p_x p_y p_z]^T in the world coordinate system.
We can express this point in the radar coordinate
system by using the standard dot product to project
the point p onto each of the unit basis vectors r, c,
and n. The resulting vector q can be written in radar
coordinates as

$$
\mathbf{q} = \begin{bmatrix} q_r \\ q_c \\ q_n \end{bmatrix}
= \begin{bmatrix} \mathbf{p} \cdot \mathbf{r} \\ \mathbf{p} \cdot \mathbf{c} \\ \mathbf{p} \cdot \mathbf{n} \end{bmatrix} .
$$

According to our basic model for the SAR transformation,
we must now project the 3-D vector q onto
the 2-D slant plane to obtain its location in the SAR
image. We can do this projection by retaining the first
two components of q and neglecting the third component,
because the entire third dimension of the
radar coordinate system becomes collapsed in the
projection process. This procedure gives the slant-plane
coordinates of the original point reflector as

$$
\mathbf{s} = \begin{bmatrix} q_r \\ q_c \end{bmatrix}
= \begin{bmatrix}
p_x \cos\phi \, \cos\theta + p_y \sin\phi \, \cos\theta - p_z \sin\theta \\
-k p_x \sin\phi \, \cos\phi \, \cos^{2}\theta + k p_y \left( \cos^{2}\phi \, \cos^{2}\theta + \sin^{2}\theta \right) + k p_z \sin\phi \, \cos\theta \, \sin\theta
\end{bmatrix} . \tag{2}
$$
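As a numerical check on this construction, the following sketch builds r, n, and c from the depression angle θ and squint angle φ using the cross products described in the text, so that orthonormality and the value of k can be verified directly. The helper names (`dot`, `cross`, `radar_basis`) are illustrative, and the sketch assumes r is not parallel to y, for which the cross product vanishes.

```python
import math


def dot(a, b):
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2]


def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def radar_basis(theta, phi):
    """Return (r, c, n, k) for depression angle theta and squint angle phi.

    Angles are in radians; all vectors are in world coordinates.
    """
    r = (math.cos(phi) * math.cos(theta),
         math.sin(phi) * math.cos(theta),
         -math.sin(theta))
    y = (0.0, 1.0, 0.0)              # direction of radar motion
    rxy = cross(r, y)                # un-normalized slant-plane normal
    k = 1.0 / math.sqrt(dot(rxy, rxy))
    n = (k * rxy[0], k * rxy[1], k * rxy[2])
    c = cross(n, r)                  # cross-range vector
    return r, c, n, k
```

For θ = φ = π/4 this gives k = 2/√3, and for θ = π/4, φ = 0 it gives k = 1, in agreement with the closed-form expression for k.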


We can use the above expression for the range and
cross-range coordinates of a point to show mathematically
that when the point is imaged at a fixed
aspect angle and depression angle, but at different
squint angles, it will appear at different SAR image
locations.

To demonstrate this concept, we conduct two imaging
experiments in which we keep the aspect and
depression angles constant but allow the squint angle
to vary. In particular, for the first imaging experiment
we use the values α = 0, θ = π/4, and φ = 0; for the
second experiment we use the values α = 0, θ = π/4,
and φ = π/4. In each experiment, once these angles
have been fixed, we consider imaging a point p that
lies on the principal target axis in the ground plane at
a distance of one unit from the aimpoint. Based on
the diagram in Figure 8, this point must have the
coordinates


For the first imaging experiment (with the angles
α = 0, θ = π/4, φ = 0), we can readily verify from
Equation 1 that k = 1, and from Equation 3 that


We can now compute the slant-plane coordinates of p
by substituting these numerical values into Equation
2. This computation yields the 2-D slant-plane location
s_1 associated with the first set of imaging angles;
this location is given by


For the second imaging experiment (with the angles
α = 0, θ = π/4, φ = π/4), we find that k = 2/√3, and


MUR  V'  1 


As before, we compute the slant-plane coordinates of
p by substituting these values into Equation 2. This
yields the 2-D slant-plane location s_2, given by


For  the  two  different  squint  angles  above,  the  range 
coordinate  of  the  imaged  point  remains  constant  but 
the  cross-range  coordinate  changes  dramatically.  This 
observation  is  an  example  of  the  more  general  result 
that,  given  a  fixed  radar  viewing  direction,  a  change  in 
the  squint  angle  causes  the  cross-range  coordinate  of  a 
point to change. Thus the above example provides
quantitative  proof  that  the  slant-plane  location  of  a 
point  is  not  uniquely  determined  by  the  aspect  and 
depression  angles  alone. 
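The two imaging experiments can be reproduced numerically with the sketch below, which evaluates the slant-plane coordinates from the closed-form range and cross-range expressions. One assumption is made explicit: for a zero aspect angle, the unit-distance point on the principal target axis is taken to be p = [cos φ, sin φ, 0]^T, i.e., along the ground projection of the radar line of sight. The sign convention of Figure 8 may differ, which would flip the signs of the results but not the conclusion. Function names are illustrative.

```python
import math


def slant_plane_coords(p, theta, phi):
    """Range and cross-range coordinates of world point p = (px, py, pz)."""
    px, py, pz = p
    ct, st = math.cos(theta), math.sin(theta)
    cp, sp = math.cos(phi), math.sin(phi)
    k = 1.0 / math.sqrt(st * st + cp * cp * ct * ct)
    rng = px * cp * ct + py * sp * ct - pz * st
    xrng = k * (-px * sp * cp * ct * ct
                + py * (cp * cp * ct * ct + st * st)
                + pz * sp * ct * st)
    return rng, xrng


def axis_point_zero_aspect(phi):
    """ASSUMPTION: with zero aspect, the principal target axis lies along the
    ground projection of the line of sight, so the unit point is at angle phi."""
    return (math.cos(phi), math.sin(phi), 0.0)


theta = math.pi / 4
s1 = slant_plane_coords(axis_point_zero_aspect(0.0), theta, 0.0)
s2 = slant_plane_coords(axis_point_zero_aspect(math.pi / 4), theta, math.pi / 4)
print("squint 0:    range %.4f, cross-range %.4f" % s1)
print("squint pi/4: range %.4f, cross-range %.4f" % s2)
```

Under this assumption the range coordinate equals cos θ ≈ 0.7071 in both experiments, while the cross-range coordinate moves from 0 to 1/√6 ≈ 0.4082.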

In  the  next  subsection,  we  give  a  simple  qualitative 
example  that  visually  demonstrates  the  effects  of  the 
squint  angle  on  the  appearance  of  a  SAR  image,  and 
thus  demonstrates  the  importance  of  incorporating 
information  about  the  squint  angle  into  the  baseline 
classification  algorithm. 

SAR  Imaging  Example 

Figure  9(a)  shows  a  perspective  view  of  a  simple  ob¬ 
ject  that  is  being  imaged  by  an  airborne  SAR.  The 
object  consists  of  a  square  grid  of  point  reflectors 
(shown  in  blue)  in  the  ground  plane,  and  one  addi¬ 
tional  point  reflector  (shown  in  red)  above  the  ground 
plane  and  directly  over  the  center  of  the  grid.  Figure 
9(b)  shows  a  top  view  of  the  same  imaging  configura¬ 
tion.  From  this  top  view,  we  can  see  that  the  grid  of 
point  reflectors  is  perfectly  aligned  with  the  projected 
radar  line  of  sight;  we  arbitrarily  define  this  orienta¬ 
tion  to  correspond  to  a  0°  aspect  angle.  The  object  is 
also  being  imaged  at  a  0°  squint  angle,  because  the 
radar  is  looking  in  a  direction  perpendicular  to  the 
line  of  flight. 

Figure  9(c)  shows  the  same  imaging  configuration 
once  again,  but  from  a  viewing  direction  perpendicu¬ 
lar  to  the  slant  plane.  Thus  we  see  the  projection  of 
the  object  onto  the  slant  plane,  which  (according  to 
our  mathematical  SAR  model)  corresponds  directly 
to  the  result  produced  by  the  SAR  imaging  process. 




Because of the projection operation, the grid of point
reflectors (in blue) appears foreshortened in the vertical
dimension, and the point reflector above the ground
(in red) appears just above the grid. Finally, Figure
9(d) shows an image-sized portion of the slant-plane
projection (displayed according to the convention that
range increases in the downward direction) that represents
the SAR image of the object at a 0° aspect
angle and a 0° squint angle.

Figure 10(a) shows a perspective view of the same
object being imaged with a different SAR geometry.
For this example, we assume that the slant plane has
been adjusted so that the depression angle matches
that of the previous example. From the top view
shown in Figure 10(b) we can see that the aspect
angle has not changed (i.e., the grid is still aligned
with the projected radar line of sight), but that the
squint angle has changed from 0° to 45°.

Figure 10(c) shows the object projected onto the
slant plane under this new imaging configuration.
The grid of point reflectors (in blue), which appeared
as a diamond from the top view, now appears as a
diamond foreshortened in the vertical dimension because
of the projection operation; the additional point
reflector above the ground plane (in red) appears over
the upper corner of this diamond because of its height.
Figure 10(d) shows an image-sized portion of the
slant-plane projection (again displayed according to




FIGURE  9.  Illustration  of  SAR  imaging  as  a  projection  (broadside  case).  The  collection  of  point  reflectors  being  imaged  is 
shown from (a) perspective view, (b) top view, and (c) slant-plane view. (d) The resulting SAR image can be interpreted as
an  image-sized  portion  of  the  slant-plane  projection,  as  indicated  by  the  orange  outline  in  part  c. 




the convention that range increases in the downward
direction) that represents the SAR image of the object
at a 0° aspect angle and a 45° squint angle.

The primary difference between this SAR image
and the one shown in Figure 9(d) is that the point
reflectors now appear shifted in the cross-range dimension.
The shift for each reflector is not, however,
a simple function of the range of the reflector, as it
may appear at first glance. Rather, the shift is a function
of the 3-D location of the reflector, which is
demonstrated by the large shift of the point reflector
above  the  ground  plane.  This  shift  has  caused  the 
reflector  to  move  out  of  alignment  with  the  middle 
column  of  the  grid,  which  can  be  seen  by  comparing 
Figures  9(d)  and  10(d). 
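The dependence of the cross-range shift on 3-D location, rather than on range alone, can be checked with a small sketch. It places reflectors in an object frame (u along the principal axis, v across it, h above the ground) and assumes that a zero aspect angle means the u axis follows the ground projection of the line of sight, so that the object turns with the squint angle. The frame, the function names, and the unit dimensions are assumptions made for illustration.

```python
import math


def radar_axes(theta, phi):
    """Unit range and cross-range vectors in world coordinates."""
    ct, st = math.cos(theta), math.sin(theta)
    cp, sp = math.cos(phi), math.sin(phi)
    k = 1.0 / math.sqrt(st * st + cp * cp * ct * ct)
    r = (cp * ct, sp * ct, -st)
    c = (-k * sp * cp * ct * ct,
         k * (cp * cp * ct * ct + st * st),
         k * sp * ct * st)
    return r, c


def image_location(u, v, h, theta, phi):
    """Slant-plane (range, cross-range) of a reflector at zero aspect.

    ASSUMPTION: the object's u axis lies along the ground projection of the
    line of sight, so object coordinates are rotated by phi into world axes.
    """
    cp, sp = math.cos(phi), math.sin(phi)
    p = (u * cp - v * sp, u * sp + v * cp, h)
    r, c = radar_axes(theta, phi)
    rng = sum(a * b for a, b in zip(p, r))
    xrng = sum(a * b for a, b in zip(p, c))
    return rng, xrng


def cross_range_shift(u, v, h, theta, phi):
    """Cross-range displacement relative to the broadside (phi = 0) image."""
    squinted = image_location(u, v, h, theta, phi)[1]
    broadside = image_location(u, v, h, theta, 0.0)[1]
    return squinted - broadside
```

With θ = φ = 45°, a reflector one unit above the grid center, (u, v, h) = (0, 0, 1), and a ground reflector at (−1, 0, 0) appear at the same range, yet under these assumptions their cross-range shifts are equal in magnitude and opposite in sign. Range alone therefore cannot predict the shift.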


Sensitivity of Baseline Classifier to Changes in Squint

The simple geometric examples given in Figures 9
and 10 show that we can produce two different images
of an object by using two different squint angles
for a fixed set of aspect and depression angles. From a
qualitative standpoint, these differences adversely affect
the performance of the baseline classifier, because
the classifier has only one reference image for each
aspect-angle and depression-angle pair. In this section,
we describe an experiment that demonstrates
quantitatively that the classification statistic used by
the baseline classifier (the correlation score) changes
significantly as a function of squint angle for a fixed
aspect-angle and depression-angle pair.


FIGURE 10. Illustration of SAR imaging as a projection (forward-looking case). The sequence of panels, (a) perspective
view, (b) top view, (c) slant-plane view, and (d) resulting SAR image, corresponds directly to the sequence shown in
Figure 9.




[Figure 11 panels: Geometry 1, Geometry 9, and Geometry 17, shown above the corresponding test images Image 1, Image 9, and Image 17.]

FIGURE 11. Depiction of the imaging configurations used to quantify the robustness of the baseline classifier with respect
to variations in squint angle. The test images, which are shown notionally below their respective configuration diagrams,
were all formed by using the same aspect angle (α = 45°) and depression angle (θ = 45°), but each had a unique squint angle
φ (a multiple of 5° in the range from −40° to +40°).


We conducted the experiment by using a target-signature
simulation package from The Analytic Sciences
Corporation known as SARTOOL [2], which
was designed to model the dominant electromagnetic
characteristics of a target. We used the SARTOOL
model of an M48 tank oriented such that both the
aspect and depression angles were fixed at 45°. We
carried out the experiment by using a single reference
image, which was created at a 0° squint angle, and 17
test images, which were created at squint angles ranging
from −40° to +40° in 5° increments. Figure 11
illustrates the imaging configurations we used to generate
these test images. (Note that the aspect angle
and the depression angle are the same for each configuration.)
The experiment consisted of correlating
the reference image with each of the test images to
generate a plot of correlation score versus squint angle.
This plot is shown as a solid line in Figure 12.

Of course, the classifier gave perfect performance
(i.e., correlation score ρ = 1.0) for the test image
formed at a 0° squint angle, because the reference
image was formed at this same squint angle. The
correlation scores progressively decline, however, as
the squint angle for the test set varies in either direction
from 0°. At the extremes of −40° and +40°, the
correlation scores are approximately 0.5. This result




FIGURE  12.  Plots  of  correlation  score  versus  squint  angle 
for  the  baseline  classifier  (solid  line)  and  the  proposed 
classifier  (dashed  line).  The  test  images  used  for  the  ex¬ 
periments  were  those  described  in  Figure  11.  The  2-D 
template  used  by  the  baseline  classifier  and  the  3-D  tem¬ 
plate  used  by  the  proposed  classifier  both  had  aspect  and 
depression  angles  matching  those  of  the  test  images. 

suggests  that  the  performance  of  the  baseline  classifier 
is  sensitive  to  changes  in  squint  angle.  Thus,  to  im¬ 
prove  the  baseline  classifier  we  must  account  for  the 
effects  of  squint  (in  addition  to  the  already  recognized 
effects  of  aspect  and  depression)  on  the  process  of 
SAR  image  formation. 

How  to  Develop  a  Better  Classifier 

Our  analysis  in  the  previous  section  suggests  that  we 
can  improve  the  performance  of  the  baseline  classifier 
by  taking  into  account  the  effects  of  both  radar  view¬ 
ing  direction  and  radar  motion  direction  on  the  SAR 
imaging  process.  In  this  section  we  propose  a  new 
classifier  that  maintains  the  conventional  template¬ 
matching  engine,  but  calls  for  two  major  modifica¬ 
tions  to  the  baseline  classifier:  (1)  the  replacement  of 
the  present  set  of  2-D  reference  images  with  a  set  of 
3-D  templates,  and  (2)  the  incorporation  of  our  math¬ 
ematical  model  of  SAR  imaging  as  a  projection  so 
that  any  3-D  template  can  be  transformed  appropri¬ 
ately  to  synthesize  a  2-D  reference  image  for  the 
correlation  operation. 

We  begin  by  giving  a  definition  of  a  3-D  template, 
and we then describe how a 3-D template is transformed
into a 2-D reference image. In addition, we


present  a  novel  technique  for  creating  3-D  templates 
from  our  existing  database  of  2-D  reference  images. 
Finally,  in  a  continuation  of  the  experiment  discussed 
in  the  previous  section,  we  show  that  the  proposed 
classifier  is  much  more  robust  with  respect  to  changes 
in  squint  angle  than  the  baseline  classifier. 

Description of 3-D Templates

A  3-D  template  is  a  finely  sampled  3-D  grid  of  points 
representing  the  volume  occupied  by  the  target  of 
interest,  in  which  each  grid  point  corresponds  to  a 
scatterer on the target. Each 3-D template is associated
with a distinct radar viewing direction, specified
by the aspect angle α and the depression angle θ. We
thus index the collection of 3-D templates by these
radar viewing angles, and we let T_αθ denote the template
corresponding to the particular pair (α, θ). The
value stored at each point in the template T_αθ represents
the radar reflectivity of the scatterer at that
point, when the target is illuminated from the direction
corresponding to (α, θ). To prepare for the development
that follows, we assume that the template
contains K grid points. The location of the jth point
in the 3-D grid is denoted by p_j, and the radar-reflectivity
value stored at this point is denoted by A_j,
for j = 1, ..., K.

To transform T_αθ into a 2-D reference image, we
use our projection model of the SAR imaging process.
Specifically, we project the points in the template T_αθ
onto the slant plane defined by the depression angle θ
and the squint angle φ, to yield a reference image that
we denote by I_αθφ. Let I_αθφ(m, n) be the value at the
range/cross-range location (m, n) in this reference
image. The relation between the values in the template
T_αθ and the reference image value I_αθφ(m, n) is
given by


where C_αθφ(m, n) is the set of indices specified by

$$
C_{\alpha\theta\phi}(m,n) = \left\{\, j : \mathbf{p}_j \text{ projects to location } (m,n) \text{ in the SAR image corresponding to } (\alpha, \theta, \phi) \,\right\} .
$$

• VERBOUT ET AL.

Improving a Template-Based Classifier in a SAR Automatic Target Recognition System by Using 3-D Target Information


Because a SAR image is composed of discrete pixels
in the range and cross-range dimensions, the location
$(m, n)$ actually corresponds to a locus of points in
the slant plane. Let $\Delta_r$ and $\Delta_c$ be the range and cross-range
pixel spacing intervals, respectively, associated
with the image. Then any slant plane location $(q_r, q_c)$
such that

$$\begin{aligned} r_m &\le q_r < r_m + \Delta_r , \\ c_n &\le q_c < c_n + \Delta_c \end{aligned}$$

(where $r_m$ and $c_n$ are appropriate constants) is mapped
to SAR image location $(m, n)$. Thus $\mathbf{p}_j$ projects to the
SAR image location $(m, n)$ if

$$\begin{aligned} r_m &\le \mathbf{p}_j \cdot \hat{\mathbf{r}} < r_m + \Delta_r , \\ c_n &\le \mathbf{p}_j \cdot \hat{\mathbf{c}} < c_n + \Delta_c , \end{aligned}$$

where $\hat{\mathbf{r}}$ and $\hat{\mathbf{c}}$ are the unit range and cross-range
vectors in the radar coordinate system that was defined
earlier.
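
The binning rule above can be sketched in a few lines of Python. This is a minimal illustration rather than the authors' implementation: it assumes NumPy, zero-based pixel indices, and a uniform grid in which the bin edges are r_m = r0 + m*dr and c_n = c0 + n*dc (the text says only that r_m and c_n are appropriate constants). The function name `project_template` and all of its parameters are hypothetical.

```python
import numpy as np

def project_template(points, amps, r_hat, c_hat, r0, c0, dr, dc, M, N):
    """Sum scatterer amplitudes into an M x N slant-plane image.

    points       : (K, 3) scatterer locations p_j
    amps         : (K,)   reflectivity values A_j
    r_hat, c_hat : unit range and cross-range vectors (length 3)
    r0, c0       : lower edges of the first range / cross-range bins (assumed)
    dr, dc       : pixel spacings in range and cross range
    """
    points = np.asarray(points, dtype=float)
    amps = np.asarray(amps, dtype=float)
    # Slant-plane coordinates of every scatterer: q_r = p . r_hat, q_c = p . c_hat
    q_r = points @ np.asarray(r_hat, dtype=float)
    q_c = points @ np.asarray(c_hat, dtype=float)
    # Pixel (m, n) collects every point with r_m <= q_r < r_m + dr, etc.
    m = np.floor((q_r - r0) / dr).astype(int)
    n = np.floor((q_c - c0) / dc).astype(int)
    inside = (m >= 0) & (m < M) & (n >= 0) & (n < N)
    image = np.zeros((M, N))
    # np.add.at accumulates correctly when several points share a pixel
    np.add.at(image, (m[inside], n[inside]), amps[inside])
    return image
```

Points that fall outside the M x N grid are simply dropped in this sketch; a real system would size the grid to cover the whole template.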

Description  of  Proposed  Classifier 

Figure 13 illustrates the algorithm used by the proposed
version of the baseline classifier (note the similarity
between this figure and Figure 3). The classifier
uses the aspect angle $\alpha$ and the depression angle $\theta$ to
select a 3-D template from the database. The points
in this 3-D template are projected onto the slant
plane specified by the squint angle $\phi$ to produce a 2-D
image, which is then correlated with the input test
image. If the correlation score exceeds the threshold $\tau$,
the target of interest is declared to be present.

Note  that  the  new  classifier  continues  to  use  the 
conventional  template-matching  engine,  so  that  the 
overall  structure  of  the  algorithm  is  unchanged.  The 


FIGURE 13. Schematic description of the algorithm used by the proposed one-class classifier. The classifier uses the
aspect angle $\alpha$ and the depression angle $\theta$ to select a 3-D template from the database. This 3-D template is projected onto
the slant plane specified by the squint angle $\phi$ to produce a 2-D image, which is then correlated with the input test image. If
the correlation score $\rho$ is greater than or equal to the threshold $\tau$, the target present decision is declared.




principal  modification  to  the  proposed  classifier  is 
that  the  original  database  of  2-D  reference  images  has 
been  replaced  by  a  database  of  3-D  templates.  In 
addition,  the  proposed  classifier  is  equipped  with  a 
processor  that  transforms  a  3-D  template  into  a  2-D 
image. 
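
The decision step of the proposed classifier can be sketched as follows. This is only an illustration under stated assumptions: the correlation score is taken here to be a plain normalized inner product of two equally sized, already registered images, which may differ from the score defined for the baseline classifier earlier in the article, and the names `correlation_score` and `target_present` are hypothetical. The reference image passed in is assumed to have been produced by projecting the selected 3-D template onto the slant plane of the test image.

```python
import numpy as np

def correlation_score(reference, test):
    """Normalized correlation of two same-sized images (assumed score definition)."""
    a = np.asarray(reference, dtype=float).ravel()
    b = np.asarray(test, dtype=float).ravel()
    denom = np.linalg.norm(a) * np.linalg.norm(b)
    # An all-zero image carries no information, so score it as 0
    return float(a @ b / denom) if denom > 0 else 0.0

def target_present(reference, test, tau):
    """Declare the target present when the score reaches the threshold tau."""
    return correlation_score(reference, test) >= tau
```

The comparison uses "greater than or equal to", following the Figure 13 caption.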

Creation  of  3-D  Templates 

Let us now consider how a 3-D template can be
created  from  the  existing  set  of  2-D  reference  images. 
For  a  target  of  interest,  we  wish  to  construct  a  3-D 
template  corresponding  to  a  particular  aspect-angle 
and  depression-angle  pair.  To  perform  this  construc¬ 
tion  we  require  two  or  more  SAR  images  of  the  target, 
all  formed  by  using  the  fixed  aspect  and  depression 
angles  of  the  template,  but  each  formed  by  using  a 
different  squint  angle.  Recall  that  each  of  these  SAR 
images  represents  a  projection  of  the  3-D  distribution 
of  target  scatterers  onto  a  2-D  slant  plane.  Our  goal  is 
to  use  the  information  contained  in  these  projections 
to  reconstruct  the  locations  and  amplitudes  of  the 
target  scatterers. 

We fix the locations of the scatterers $\mathbf{p}_1, \ldots, \mathbf{p}_K$ so
that they represent a uniform sampling of a parallelepiped
that is approximately the size of the target.
Once these scatterer locations are fixed, we then solve
for the unknown amplitudes $A_j$ corresponding to the
points $\mathbf{p}_j$. Mathematically, we formulate the problem
of determining the $A_j$ values in the following way. Let
us assume we have $L$ actual SAR images $I_1, \ldots, I_L$
formed at squint angles $\phi_1, \ldots, \phi_L$, respectively. Corresponding
to this sequence of actual images, we let
$\hat{I}_1, \ldots, \hat{I}_L$ be a sequence of synthetic images formed
from the template amplitude values. By using our
projection model for the SAR imaging process, the
value in the $i$th synthetic image corresponding to the
range/cross-range location $(m, n)$ is computed by

$$\hat{I}_i(m, n) = \sum_{j \in \Omega_i(m, n)} A_j , \tag{4}$$

where $\Omega_i(m, n)$ is the set of indices specified by

$$\Omega_i(m, n) = \left\{\, j \;\middle|\; \mathbf{p}_j \text{ projects to location } (m, n) \text{ in the SAR image corresponding to } \phi_i \,\right\}.$$

Let the total mean-square difference between the set
of synthetic images and the set of actual images be
given by

$$\varepsilon(A_1, \ldots, A_K) = \sum_{i=1}^{L} \sum_{m=1}^{M} \sum_{n=1}^{N} \left[ \hat{I}_i(m, n) - I_i(m, n) \right]^2 , \tag{5}$$

where $M$ and $N$ are the range and cross-range dimensions,
respectively, of the SAR images. (Note that the
dependency of $\varepsilon$ on each $A_j$ enters Equation 5 implicitly
through Equation 4.)

We can now cast the template construction problem
as a multivariable minimization problem. Specifically,
we compute the values of $A_1, \ldots, A_K$ by
solving

$$\text{Minimize } \varepsilon(A_1, \ldots, A_K) \quad \text{such that } A_j \ge 0, \text{ for } j = 1, 2, \ldots, K .$$

To determine an optimal solution, we begin by assigning
to each unknown amplitude $A_j$ an initial amplitude
that represents our best a priori estimate of the
actual radar reflectivity at that point. In the absence
of a priori knowledge, we assign a random initial
value to each $A_j$. Figure 14 illustrates the iterative
procedure we use to compute the template amplitude
values.

The points $\mathbf{p}_j$ are projected onto each of the $L$ slant
planes (each slant plane corresponds to an actual SAR
image supplied to the algorithm), which results in a
sequence of synthetic images that can be compared to
the actual images. The total squared error is computed
from these two sets of images (synthetic and
actual), and the amplitude values are adjusted such
that this total error is reduced. The iteration then
cycles through the stages of synthetic image formation,
error computation, and amplitude adjustment.
The procedure is terminated when the total squared
error is less than some prespecified tolerance.

Many  standard  gradient-descent  techniques  are 
available  for  implementing  this  iterative  minimiza¬ 
tion;  for  more  details  on  these  techniques  see  the 
book by D.G. Luenberger [3].
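
Because each synthetic pixel is a sum of amplitudes, the stacked synthetic images are a linear function of the amplitude vector, and the constrained minimization can be sketched as a projected gradient descent. The sketch below is one possible instance of the iterative procedure, not necessarily the variant the authors used. It assumes NumPy and a hypothetical 0/1 matrix `P` whose entry `P[q, j]` is 1 when scatterer j falls in stacked pixel q (this encodes the index sets of Equation 4); it starts from zero amplitudes for reproducibility rather than from a random or a priori guess; and it uses a fixed step size chosen from the largest singular value of `P`.

```python
import numpy as np

def fit_amplitudes(P, y, iters=500, tol=1e-12):
    """Projected gradient descent for: minimize ||P A - y||^2 subject to A >= 0.

    P : (Q, K) 0/1 projection matrix over all L images stacked (hypothetical)
    y : (Q,)   stacked pixel values of the actual images
    """
    P = np.asarray(P, dtype=float)
    y = np.asarray(y, dtype=float)
    A = np.zeros(P.shape[1])                        # deterministic start (assumption)
    # Gradient of the squared error is 2 P^T (P A - y); its Lipschitz
    # constant is 2 * sigma_max(P)^2, so this fixed step cannot diverge.
    step = 1.0 / (2.0 * np.linalg.norm(P, 2) ** 2)
    for _ in range(iters):
        residual = P @ A - y                        # synthetic minus actual
        if residual @ residual < tol:               # total squared error small enough
            break
        A = A - step * 2.0 * (P.T @ residual)       # amplitude adjustment
        A = np.maximum(A, 0.0)                      # enforce A_j >= 0
    return A
```

With only a few squint angles the system is usually underdetermined, so many amplitude assignments can reproduce the input images equally well; the sketch returns whichever one the iteration reaches from its starting point.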




[Figure 14 panel labels: Real image 1, Real image 2, ..., Real image L]


FIGURE  14.  Illustration  of  the  3-D  template-creation  procedure.  The  procedure  begins  with  a  random  assignment  of  radar- 
reflectivity  values  at  points  in  the  3-D  template;  thereafter  the  procedure  becomes  an  iterative  refinement  process.  The 
amplitudes  are  projected  onto  L  slant  planes  (each  slant  plane  corresponds  to  an  actual  SAR  image  supplied  to  the 
algorithm),  which  results  in  a  sequence  of  synthetic  images  that  can  be  compared  to  the  actual  images.  The  total  squared 
error  is  computed  from  these  two  sets  of  images  (synthetic  and  actual),  and  the  amplitude  values  are  adjusted  such  that  this 
total  error  is  reduced.  The  iteration  then  cycles  through  the  stages  of  synthetic  image  formation,  error  computation,  and 
amplitude  adjustment.  The  procedure  is  terminated  when  the  total  squared  error  is  less  than  some  prespecified  tolerance. 

Sensitivity of Proposed Classifier to Changes in Squint Angle

Earlier we described an experiment with the baseline
classifier that provided quantitative proof that the
correlation score varies significantly as a function of
squint angle for a fixed aspect-angle and depression-angle
pair. Because the correlation score is the classification
statistic used by the baseline classifier, overall
performance is extremely sensitive to changes in squint
angle. This section summarizes a continuation of the
experiment in which we measure the sensitivity of the
proposed classifier to changes in squint angle.

For this experiment we used the same set of 17 test
images described earlier. Recall that these images were
formed by using the SARTOOL model of the M48
tank oriented such that both the aspect angle and
depression angle were fixed at 45°. These images were
created with squint angles ranging from −40° to +40°



in 5° increments. Let us denote the $i$th image in this
set by $I_i$, with the corresponding squint angle $\phi_i$ given
by the expression

$$\phi_i = -40^\circ + 5^\circ (i - 1), \quad i = 1, \ldots, 17 .$$

Our  2-D  reference  image,  which  was  created  at  a 
squint  angle  of  0°,  was  replaced  by  a  3-D  template 




FIGURE  15.  (a)  Simple  solids  model  of  an  M48  tank;  (b)  the 
same  model  of  the  tank  with  an  overlay  of  the  most  signifi¬ 
cant  radar-reflectivity  information  contained  in  the  3-D  tem¬ 
plate  for  the  tank.  (The  overall  intensity  of  the  underlying 
solids  model  has  been  reduced  to  give  emphasis  to  the 
radar-reflectivity  information.)  Note  that  there  are  signifi¬ 
cant  radar  returns  from  the  front  right  fender,  the  turret, 
the  track  region,  and  the  front  left  portion  of  the  tank. 


corresponding to an aspect angle of 45° and a depression
angle of 45°. We constructed this 3-D template
by applying the algorithm described in the previous
section  to  three  SARTOOL  images  formed  at  squint 
angles  of  -40°,  0°,  and  40°.  These  three  images  are 
displayed  in  Figures  7(b),  7(c),  and  7(d),  respectively. 

The  result  of  this  3-D  template  construction  is 
interesting  to  observe.  Recall  that  the  value  stored  at 
each  point  in  the  template  is  the  radar  reflectivity  of 
the scatterer at that location, when the target is imaged
from the given viewing direction. Figure 15(a)
shows  a  simple  solids  model  that  represents  the  basic 
features  and  dimensions  of  the  M48  tank;  Figure 
15(b)  shows  the  same  model  overlaid  with  the  most 
significant  radar-reflectivity  values  contained  in  the 
template.  Note  that  there  are  significant  radar  returns 
from  the  front  right  fender,  as  well  as  from  the  turret, 
the  track  region,  and  the  front  left  portion  of  the 
tank. 

In our previous experiment we correlated the reference
image with each of the 17 test images to generate
a plot (denoted by the solid line in Figure 12) of
correlation score versus squint angle. In this experiment
we first transformed the 3-D template into a
sequence of reference images, which we denote by
$\hat{I}_1, \ldots, \hat{I}_{17}$, corresponding to squint angles $\phi_1, \ldots, \phi_{17}$.
We then correlated $\hat{I}_i$ with $I_i$, for $i = 1, \ldots, 17$, to
obtain the dashed line in Figure 12.

We  observe  in  this  plot  that  the  highest  correlation 
scores  occur  at  the  squint  angles  -40°,  0°,  and  +40°. 
This  result  is  not  surprising  because  the  template  was 
constructed  by  using  images  formed  at  these  three 
squint  angles.  The  average  correlation  score  obtained 
by  using  the  3-D  template  is  approximately  0.85, 
which  is  much  higher  than  the  average  correlation 
score  obtained  by  using  the  baseline  classifier  in  the 
previous  experiment.  Moreover,  the  scores  do  not 
change  significantly  as  the  squint  angle  varies  in  ei¬ 
ther  direction  from  0°,  which  suggests  that  the  pro¬ 
posed  classifier  is  more  robust  with  respect  to  changes 
in  squint  angle  than  the  baseline  classifier. 

Summary 

In  this  article,  we  have  analyzed  and  suggested  im¬ 
provements  to  a  conventional  template-matching  clas¬ 
sifier  currently  used  in  an  operational  ATR  system. 




This conventional classifier uses a collection of 2-D
SAR reference images to represent a full range of
radar viewing directions for a prespecified set of targets.
For each target category, the input image is
correlated  with  the  reference  image  that  was  formed 
from  the  most  similar  radar  viewing  direction;  the 
input  is  then  classified  to  the  category  with  the  high¬ 
est  correlation  score. 

Although  this  algorithm  seems  reasonable,  we  found 
that  it  produces  surprisingly  poor  classification  results 
for  some  target  types.  We  explained  these  poor  results 
by  using  a  simple  mathematical  model  of  the  SAR 
imaging  process.  As  our  model  reveals,  radar  motion 
direction  is  as  important  as  radar  viewing  direction  in 
specifying  SAR  imaging  geometry.  Thus  two  target 
images  formed  with  the  same  radar  viewing  direction 
but  different  radar  motion  directions  can  appear  quite 
different.  Because  the  conventional  classifier  does  not 
explicitly  account  for  radar  motion  direction,  its  per¬ 
formance  is  degraded. 

Accordingly,  we  have  proposed  and  demonstrated 
an  improved  version  of  the  conventional  template- 
based  classifier  that  accounts  for  both  direction  pa¬ 
rameters.  In  our  improved  classifier,  each  2-D  image 
from  the  reference  library  is  replaced  by  a  3-D  tem¬ 
plate  so  that  more  target  scattering  information  is 
available  at  each  viewing  direction.  As  in  the  conven¬ 
tional  classifier,  the  reference  image  is  selected  on  the 
basis  of  radar  viewing  direction;  by  using  the  math¬ 
ematical  SAR  imaging  model  the  improved  classifier 
then  transforms  the  selected  3-D  template  to  a  2-D 
image  whose  radar  motion  direction  matches  that  of 
the  input  image. 

After  comparing  the  experimental  correlation  scores 
between  the  original  2-D  template-based  classifier 
and  the  improved  3-D  template-based  classifier,  we 
conclude  that  the  new  classifier  is  significantly  more 
robust  with  respect  to  changes  in  squint  angle. 

Acknowledgments 

The authors gratefully acknowledge the encouragement
and support of Leslie M. Novak and Gerald B.
Morse.  We  also  thank  Michael  C.  Burl  for  many 
technical  discussions  on  the  topic  of  SAR  imaging. 
This  research  was  sponsored  by  the  Advanced  Re¬ 
search  Projects  Agency. 


REFERENCES 

1. J.C. Henry, “The Lincoln Laboratory 35 GHz Airborne Polarimetric
SAR Imaging System,” IEEE National Telesystems
Conf., Atlanta, GA, 26–27 Mar. 1991, p. 353.

2. S. Zabele, S. Bachinsky, B. Myers, A. Stiehl, and R. Pinto,
Signature Prediction Users Manual, version 1 (The Analytic
Sciences Corporation, Reading, MA, 1990).

3. D.G. Luenberger, Linear and Nonlinear Programming (Addison-Wesley,
Reading, MA, 1984).




APPENDIX: 

MODELING  SPOTLIGHT  SAR  IMAGING 
AS  A  PROJECTION 


Throughout the text of this article we modeled
the  SAR  imaging  process  as  a  projection  of  the  3-D 
distribution  of  target  scatterers  onto  a  2-D  slant  plane. 
We  relied  heavily  on  this  projection  model  as  we 
analyzed  problems  with  the  baseline  classifier  and 
developed  improvements  to  it.  In  this  appendix  we 
provide  justification  for  using  the  projection  model, 
and  we  state  the  conditions  under  which  this  model  is 
valid. 

Our  strategy  for  justifying  the  projection  model 
consists  of  four  main  steps.  In  the  first  step,  we  con¬ 
struct  the  basis  vectors  for  the  radar  coordinate  sys¬ 
tem  and  perform  the  projection  operation  on  a  point 
reflector  to  obtain  approximate  expressions  for  the 
SAR  image  location  of  the  reflector.  In  the  second 
step,  we  build  a  foundation  for  analyzing  the  projec¬ 
tion  approximations  by  writing  the  exact  nonlinear 
expressions  for  the  physical  quantities  that  are  mea¬ 
sured  by  a  SAR  when  imaging  the  point  reflector.  In 
the  third  step,  we  expand  these  nonlinear  expressions 
into  first-order  Taylor  series  in  the  vicinity  of  the 
radar  aimpoint,  and  then  observe  that  the  resulting 
linear  approximations  are  identical  to  the  original 
projection  approximations.  Finally,  in  the  fourth  step, 
we  quantify  the  accuracy  of  the  projection  model  by 
deriving  simple  bounds  on  the  approximation  error. 

We  begin  by  obtaining  expressions  for  the  pro¬ 
jected  location  of  a  point  reflector  in  the  radar  slant 
plane.  In  keeping  with  the  notation  we  established  for 
Figure  8,  we  define  the  position  of  the  point  reflector 
in  world  coordinates  as 


$$\mathbf{p} = \begin{bmatrix} p_x \\ p_y \\ p_z \end{bmatrix} .$$

In  addition,  we  define  the  time-dependent  sensor 
position  in  world  coordinates  as 


$$\mathbf{s}(t) = \begin{bmatrix} s_x(t) \\ s_y(t) \\ s_z(t) \end{bmatrix} .$$

Because the sensor is moving at a fixed altitude parallel
to the $y$-axis in Figure 8, we explicitly remove the
time dependency from the first and third coordinates
of $\mathbf{s}(t)$ by setting $s_x(t) = s_x$ and $s_z(t) = s_z$.

With  the  sensor  and  point-reflector  positions  de¬ 
fined,  we  can  now  construct  the  basis  vectors  for  the 
radar  coordinate  system.  Recall  from  the  main  text 
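that we originally expressed the basis vectors $\hat{\mathbf{r}}$, $\hat{\mathbf{c}}$, and
$\hat{\mathbf{n}}$ in terms of the imaging angles $\theta$ and $\phi$. In this
section, we reconstruct the same basis vectors $\hat{\mathbf{r}}$, $\hat{\mathbf{c}}$,
and $\hat{\mathbf{n}}$, but we express them in a form that is more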
convenient  and  more  useful  for  our  derivations.  In 
particular,  rather  than  using  fixed  angles  from  a  single 
imaging  geometry,  we  express  these  vectors  in  a  time- 
dependent  form  in  terms  of  the  sensor  coordinates. 

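At time $t$ the range vector $\hat{\mathbf{r}}(t)$ can be constructed
by using the formula

$$\hat{\mathbf{r}}(t) = -\frac{\mathbf{s}(t)}{\|\mathbf{s}(t)\|} = \frac{-1}{\sqrt{s_x^2 + s_y^2(t) + s_z^2}} \begin{bmatrix} s_x \\ s_y(t) \\ s_z \end{bmatrix} .$$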

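Also, as before, the slant-plane normal vector $\hat{\mathbf{n}}(t)$
can be constructed by using the formula

$$\hat{\mathbf{n}}(t) = \frac{\hat{\mathbf{r}}(t) \times \hat{\mathbf{y}}}{\|\hat{\mathbf{r}}(t) \times \hat{\mathbf{y}}\|} = \frac{1}{\sqrt{s_x^2 + s_z^2}} \begin{bmatrix} s_z \\ 0 \\ -s_x \end{bmatrix} .$$

(Note that this normal vector is constant, because the
radar slant plane does not change with time.) Finally,
the cross-range vector $\hat{\mathbf{c}}(t)$ is determined by the cross
product of the other two vectors, as given by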




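$$\hat{\mathbf{c}}(t) = \hat{\mathbf{n}}(t) \times \hat{\mathbf{r}}(t) = \frac{1}{\sqrt{s_x^2 + s_z^2}\,\sqrt{s_x^2 + s_y^2(t) + s_z^2}} \begin{bmatrix} -s_x s_y(t) \\ s_x^2 + s_z^2 \\ -s_y(t)\, s_z \end{bmatrix} .$$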

The projection of the point $\mathbf{p}$ onto each of these basis
vectors yields the new vector

$$\mathbf{q}(t) = \begin{bmatrix} q_r(t) \\ q_c(t) \\ q_n(t) \end{bmatrix}$$

in radar coordinates. In particular, the range and cross-range
coordinates of the point $\mathbf{p}$ are given by

$$q_r(t) = \mathbf{p} \cdot \hat{\mathbf{r}}(t) = \frac{-1}{\sqrt{s_x^2 + s_y^2(t) + s_z^2}} \left[ p_x s_x + p_y s_y(t) + p_z s_z \right]$$

and

$$q_c(t) = \mathbf{p} \cdot \hat{\mathbf{c}}(t) = \frac{1}{\sqrt{s_x^2 + s_z^2}\,\sqrt{s_x^2 + s_y^2(t) + s_z^2}} \left[ -p_x s_x s_y(t) + p_y \left( s_x^2 + s_z^2 \right) - p_z s_y(t)\, s_z \right] .$$

Having obtained these projection approximations
for the range and cross-range coordinates of the point
$\mathbf{p}$, we now seek expressions for the actual quantities
measured by a SAR with respect to the location of $\mathbf{p}$.
Specifically, these quantities are (1) the relative range
of the point reflector (i.e., the difference between the
distance from the sensor to the point reflector and the
distance from the sensor to the aimpoint), and (2) a
scaled version of the relative range rate of the point
reflector (i.e., the rate of change of the relative range
with respect to time). We can express the relative
range of the point reflector as

$$r[\mathbf{s}(t), \mathbf{p}] = \|\mathbf{s}(t) - \mathbf{p}\| - \|\mathbf{s}(t)\| .$$

In addition, by differentiating the above expression
with respect to time, we can write the relative range
rate $\dot{r}[\mathbf{s}(t), \mathbf{p}]$ of the point reflector as

$$\dot{r}[\mathbf{s}(t), \mathbf{p}] = \frac{d}{dt}\, r[\mathbf{s}(t), \mathbf{p}] = \dot{s}_y(t) \left[ \frac{s_y(t) - p_y}{\|\mathbf{s}(t) - \mathbf{p}\|} - \frac{s_y(t)}{\|\mathbf{s}(t)\|} \right] .$$


In reality, the relative range rate is rarely used in its
raw form as it appears above, because it is so highly
dependent on both the speed of the sensor and the
distance of the sensor from the aimpoint. Rather,
these undesirable dependencies on the absolute sensor
velocity and position are usually removed through
preprocessing, so that the cross-range dimension of
the resulting SAR images is normalized, and SAR
images created under different imaging conditions
can be directly compared. Thus we introduce two
simple time-dependent corrections to $\dot{r}[\mathbf{s}(t), \mathbf{p}]$ (one
correction for the absolute sensor velocity, and the
other correction for the absolute sensor position) to
obtain the compensated relative range rate, which we
denote by $\dot{r}_c[\mathbf{s}(t), \mathbf{p}]$.

To  compute  the  correction  for  sensor  motion,  we 
begin  by  decomposing  the  sensor  velocity  vector  into 
two  velocity  components  in  the  slant  plane,  with  one 
component  along  the  radar  line  of  sight,  and  the 
other component orthogonal to the radar line of sight.
We  then  note  (under  the  assumption  that  the  sensor 
is  far  from  the  aimpoint)  that  the  relative  range  rate  is 
affected  only  by  the  component  of  the  sensor  velocity 
vector  that  is  orthogonal  to  the  radar  line  of  sight. 
(This  statement  is  true  because  the  velocity  compo¬ 
nent  along  the  radar  line  of  sight,  considered  sepa¬ 
rately,  induces  exactly  the  same  range  rate  on  all  points 
in  the  imaging  area,  resulting  in  a  relative  range  rate 
of  zero  for  these  points.)  The  sensor  speed  in  the 
direction  orthogonal  to  the  radar  line  of  sight  can  be 
expressed  as 


$$\frac{\sqrt{s_x^2 + s_z^2}}{\|\mathbf{s}(t)\|}\, \dot{s}_y(t) .$$



Because  the  sensor  speed  appears  in  the  numerator  of 
the  expression  for  relative  range  rate,  the  above  com¬ 
pensation  for  sensor  speed  will  appear  in  the  denomi¬ 
nator  of  the  overall  range-rate  correction  term. 

To obtain the correction for sensor position, we
note that the denominator of each term in the expression
for relative range rate is on the order of $\|\mathbf{s}(t)\|$.
Thus a suitable compensation for the distance of the
sensor from the aimpoint is simply this factor $\|\mathbf{s}(t)\|$,
which will appear in the numerator of the overall
range-rate correction term. By applying both of the
computed corrections to the original expression for
relative range rate, we can express the compensated
relative range rate as

$$\dot{r}_c[\mathbf{s}(t), \mathbf{p}] = \frac{-\|\mathbf{s}(t)\|^2}{\sqrt{s_x^2 + s_z^2}\; \dot{s}_y(t)}\; \dot{r}[\mathbf{s}(t), \mathbf{p}] = \frac{-\|\mathbf{s}(t)\|^2}{\sqrt{s_x^2 + s_z^2}} \left[ \frac{s_y(t) - p_y}{\|\mathbf{s}(t) - \mathbf{p}\|} - \frac{s_y(t)}{\|\mathbf{s}(t)\|} \right] .$$

Having produced explicit expressions for relative
range and compensated relative range rate, we rewrite
them more suggestively as range and cross-range measurements
by using the notation

$$\tilde{q}_r(t) = r[\mathbf{s}(t), \mathbf{p}]$$

and

$$\tilde{q}_c(t) = \dot{r}_c[\mathbf{s}(t), \mathbf{p}] .$$


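We now show that these actual radar measurements
$\tilde{q}_r(t)$ and $\tilde{q}_c(t)$ are well approximated by the previously
computed projections $q_r(t)$ and $q_c(t)$, respectively.
We begin by separately expanding the expressions
for $\tilde{q}_r(t)$ and $\tilde{q}_c(t)$ into Taylor series around
the radar aimpoint (i.e., around $\mathbf{p} = \mathbf{0}$), retaining only
the first-order terms. For the range component, this
procedure yields

$$\tilde{q}_r(t) \approx r[\mathbf{s}(t), \mathbf{p}]\Big|_{\mathbf{p}=\mathbf{0}} + p_x \cdot \frac{\partial r[\mathbf{s}(t), \mathbf{p}]}{\partial p_x}\bigg|_{\mathbf{p}=\mathbf{0}} + p_y \cdot \frac{\partial r[\mathbf{s}(t), \mathbf{p}]}{\partial p_y}\bigg|_{\mathbf{p}=\mathbf{0}} + p_z \cdot \frac{\partial r[\mathbf{s}(t), \mathbf{p}]}{\partial p_z}\bigg|_{\mathbf{p}=\mathbf{0}} ,$$

which reduces to

$$\tilde{q}_r(t) \approx 0 + \frac{-1}{\sqrt{s_x^2 + s_y^2(t) + s_z^2}} \left[ p_x s_x + p_y s_y(t) + p_z s_z \right] = \mathbf{p} \cdot \hat{\mathbf{r}}(t) .$$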

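For the cross-range component, the Taylor series expansion
yields

$$\tilde{q}_c(t) \approx \dot{r}_c[\mathbf{s}(t), \mathbf{p}]\Big|_{\mathbf{p}=\mathbf{0}} + p_x \cdot \frac{\partial \dot{r}_c[\mathbf{s}(t), \mathbf{p}]}{\partial p_x}\bigg|_{\mathbf{p}=\mathbf{0}} + p_y \cdot \frac{\partial \dot{r}_c[\mathbf{s}(t), \mathbf{p}]}{\partial p_y}\bigg|_{\mathbf{p}=\mathbf{0}} + p_z \cdot \frac{\partial \dot{r}_c[\mathbf{s}(t), \mathbf{p}]}{\partial p_z}\bigg|_{\mathbf{p}=\mathbf{0}} ,$$

which reduces to

$$\tilde{q}_c(t) \approx 0 + \frac{1}{\sqrt{s_x^2 + s_z^2}\,\sqrt{s_x^2 + s_y^2(t) + s_z^2}} \left[ -p_x s_x s_y(t) + p_y \left( s_x^2 + s_z^2 \right) - p_z s_y(t)\, s_z \right] = \mathbf{p} \cdot \hat{\mathbf{c}}(t) .$$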


Note that these linear Taylor series expansions are
identical to the original projections $q_r(t)$ and $q_c(t)$,
indicating that $q_r(t)$ and $q_c(t)$ are good approximations
to $\tilde{q}_r(t)$ and $\tilde{q}_c(t)$ in the vicinity of the radar
aimpoint.
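
A quick numerical check of this claim is sketched below. It is an illustration only: it assumes NumPy, places the aimpoint at the origin, takes the flight direction along the y-axis as in the appendix, and uses a hypothetical sensor position and scatterer location. The measured quantities are the relative range and the compensated relative range rate (in which the sensor speed cancels), and they are compared with the projections onto the range and cross-range unit vectors.

```python
import numpy as np

def radar_basis(s):
    """Unit range, cross-range, and slant-plane normal vectors for sensor position s."""
    s = np.asarray(s, dtype=float)
    y_hat = np.array([0.0, 1.0, 0.0])           # flight direction (y-axis)
    r_hat = -s / np.linalg.norm(s)              # points from sensor toward aimpoint
    n_hat = np.cross(r_hat, y_hat)
    n_hat /= np.linalg.norm(n_hat)
    c_hat = np.cross(n_hat, r_hat)
    return r_hat, c_hat, n_hat

def relative_range(s, p):
    """||s - p|| - ||s||: range to the scatterer minus range to the aimpoint."""
    return np.linalg.norm(s - p) - np.linalg.norm(s)

def compensated_range_rate(s, p):
    """Compensated relative range rate; independent of the sensor speed."""
    norm_s = np.linalg.norm(s)
    bracket = (s[1] - p[1]) / np.linalg.norm(s - p) - s[1] / norm_s
    return -norm_s**2 / np.hypot(s[0], s[2]) * bracket

s = np.array([5000.0, 1000.0, 5000.0])   # hypothetical sensor position (metres)
p = np.array([3.0, -2.0, 1.0])           # hypothetical scatterer near the aimpoint
r_hat, c_hat, n_hat = radar_basis(s)

range_error = abs(relative_range(s, p) - p @ r_hat)
cross_range_error = abs(compensated_range_rate(s, p) - p @ c_hat)
```

For a scatterer a few metres from the aimpoint and a sensor several kilometres away, both errors come out small compared with the scatterer's distance from the aimpoint, consistent with the first-order argument.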

Error  Analysis 

Let us now quantify the error incurred by using these
linear approximations instead of the actual measurements.
To keep the analysis concise, we examine only
the error in the relative range approximation. To simplify
notation, we arbitrarily choose a time $t_0$ and
remove the explicit time dependency of variables by
setting $\mathbf{s} = \mathbf{s}(t_0)$, $q_r = q_r(t_0)$, and $\tilde{q}_r = \tilde{q}_r(t_0)$. In
addition, for convenience we express the length of $\mathbf{p}$
as a fraction $d$ of the length of $\mathbf{s}$ (i.e., $\|\mathbf{p}\| = d\|\mathbf{s}\|$), and
we define $\beta$ to be the angle between $\mathbf{p}$ and $\mathbf{s}$.

With these definitions, we can rewrite the expression
for the true relative range measurement $\tilde{q}_r$ by
using the law of cosines to yield

$$\begin{aligned} \tilde{q}_r &= \sqrt{\|\mathbf{s}\|^2 + \|\mathbf{p}\|^2 - 2\|\mathbf{s}\|\|\mathbf{p}\|\cos\beta} - \|\mathbf{s}\| \\ &= \sqrt{\|\mathbf{s}\|^2 + d^2\|\mathbf{s}\|^2 - 2d\|\mathbf{s}\|^2\cos\beta} - \|\mathbf{s}\| \\ &= \|\mathbf{s}\| \left( \sqrt{1 + d^2 - 2d\cos\beta} - 1 \right) . \end{aligned}$$

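Also, we can rewrite the expression for the relative
range approximation $q_r$ by using the identity

$$\mathbf{p} \cdot \mathbf{s} = \|\mathbf{p}\|\,\|\mathbf{s}\| \cos\beta$$

to yield

$$q_r = -\frac{\|\mathbf{p}\|\,\|\mathbf{s}\| \cos\beta}{\|\mathbf{s}\|} = -d\,\|\mathbf{s}\| \cos\beta .$$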

Because $q_r$ always underestimates $\tilde{q}_r$, we can write
the absolute approximation error $e$ simply as

$$e = \tilde{q}_r - q_r = \|\mathbf{s}\| \left( \sqrt{1 + d^2 - 2d\cos\beta} - 1 + d\cos\beta \right) .$$


For a fixed value of $d$, the error reaches its maximum
value at the angle $\beta = \cos^{-1}(d/2)$. Substituting
this angle back into the formula for $e$ yields the
upper bound

$$e \le \frac{d^2}{2}\,\|\mathbf{s}\| = \frac{d}{2}\,\|\mathbf{p}\| .$$

For  the  specific  case  of  data  collection  with  the  Lin¬ 
coln  Laboratory  millimeter-wave  sensor,  d  is  typically 
no  larger  than  0.05,  so  the  error  incurred  by  using  the 
projection  approximation  for  a  given  point  p  is  no 
more  than  2.5%  of  the  distance  of  p  from  the 
aimpoint. 
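
The location and size of the worst-case error can be checked numerically. The sketch below assumes NumPy and simply evaluates the normalized error, that is, the error divided by the sensor-to-aimpoint distance, on a dense grid of angles between 0 and 180 degrees for d = 0.05; the function name `normalized_error` is hypothetical.

```python
import numpy as np

def normalized_error(d, beta):
    """Range approximation error divided by ||s||, for ||p|| = d ||s||."""
    return np.sqrt(1.0 + d**2 - 2.0 * d * np.cos(beta)) - 1.0 + d * np.cos(beta)

d = 0.05
betas = np.linspace(0.0, np.pi, 100001)     # angle between p and s, in radians
errors = normalized_error(d, betas)

worst_beta = betas[np.argmax(errors)]       # expected near arccos(d / 2)
worst_error = errors.max()                  # expected near d**2 / 2
worst_fraction_of_p = worst_error / d       # error as a fraction of ||p||
```

Dividing the worst normalized error by d expresses it as a fraction of the scatterer's distance from the aimpoint, which is the quantity quoted as 2.5% in the text.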

The derivation of the error bound for the cross-range
approximation is similar in spirit to the derivation
given above, but it is much more lengthy and
tedious, and hence is omitted.




SHAWN M. VERBOUT
is an assistant staff member in
the Surveillance Systems group.
His research interests are in the
development of algorithms for
signal detection, estimation,
and classification. He received a
B.S. degree in mathematics
from Illinois State University,
and he is currently pursuing a
master's degree in electrical
engineering at MIT as a member
of the Lincoln Laboratory
Staff Associate Program. Shawn
has been at Lincoln Laboratory
since 1987.


WILLIAM W. IRVING
is an associate staff member in
the Surveillance Systems group.
His research interests are in the
areas of decentralized Bayesian
detection theory and multiresolution
stochastic modeling.
He received S.B., S.M., and
E.E. degrees in electrical engineering
from MIT, where he is
currently pursuing a Ph.D.
degree as a member of the
Lincoln Laboratory Staff Associate
Program. Bill has been at
Lincoln Laboratory since 1987.


AMANDA S. HANES
is an assistant staff member in
the Surveillance Systems group.
Her research interests are in the
detection, discrimination, and
classification of stationary
targets. She received a B.S.
degree in mathematics from
Tufts University, and she also
worked at the Johns Hopkins
University Applied Physics
Laboratory through the Professional
Summer Student Program.
Amanda has been at
Lincoln Laboratory since 1989.




Neural  Systems  for  Automatic 
Target  Learning  and  Recognition 

Allen  M.  Waxman,  Michael  Seibert,  Ann  Marie  Bernardon,  and  David  A.  Fay 

■  We  have  designed  and  implemented  several  computational  neural  systems  for 
the  automatic  learning  and  recognition  of  targets  in  both  passive  visible  and 
synthetic-aperture  radar  (SAR)  imagery.  Motivated  by  biological  vision  systems 
(in  particular,  that  of  the  macaque  monkey),  our  computational  neural  systems 
employ  a  variety  of  neural  networks.  Boundary  Contour  System  (BCS)  and 
Feature  Contour  System  (FCS)  networks  are  used  for  image  conditioning. 

Shunting  center-surround  networks,  Diffusion-Enhancement  Bilayer  (DEB) 
networks,  log-polar  transforms,  and  overlapping  receptive  fields  are  responsible 
for  feature  extraction  and  coding.  Adaptive  Resonance  Theory  (ART-2) 
networks  perform  aspect  categorization  and  template  learning  of  the  targets. 

And  Aspect  networks  are  used  to  accumulate  evidence/confidence  over  temporal 
sequences  of  imagery. 

In  this  article  we  present  an  overview  of  our  research  for  the  past  several 
years,  highlighting  our  earlier  work  on  the  unsupervised  learning  of  three- 
dimensional  (3-D)  objects  as  applied  to  aircraft  recognition  in  the  passive  visible 
domain,  the  recent  modification  of  this  system  with  application  to  the  learning 
and  recognition  of  tactical  targets  from  SAR  imagery,  the  further  application  of 
this  system  to  reentry-vehicle  recognition  from  inverse  SAR,  or  ISAR,  imagery, 
and  the  incorporation  of  this  recognition  system  on  a  mobile  robot  called  the 
Mobile  Adaptive  Visual  Navigator  (MAVIN)  at  Lincoln  Laboratory. 


From the study of biological vision systems,
we  can  learn  much  that  applies  to  the  design 
of  computational  neural  systems  for  target  rec¬ 
ognition.  These  insights  are  most  relevant  to  passive 
vision  systems,  such  as  visible  and  multispectral  infra¬ 
red  imaging  systems,  but  similar  organizing  principles 
are  also  useful  in  the  radar  imaging  domain.  In  the 
next  section,  we  summarize  the  primary  lessons  that 
have  been  learned  from  the  anatomical,  physiological, 
and  psychophysical  study  of  vision  systems  in  the 
macaque  monkey  and  man.  These  insights  are  then 
applied  throughout  the  remaining  sections  of  this 
review.  (Note:  An  introduction  to  biological  vision, 
learning,  and  memory  can  be  found  in  the  September 
1992  special  issue  of  Scientific  American,  which  is 
entitled “Mind and Brain.”)


Design  Constraints  from  Biological  Vision 

The  vision  systems  of  primates  contain  two  primary 
processing  streams:  the  parvocellular  stream,  which 
processes shape information, and the magnocellular
stream, which processes motion information (see References
1 and 2, and the references cited therein).
Both  streams  begin  in  the  retina  and  culminate  in  the 
parietal  and  temporal  lobes  of  the  cerebral  cortex. 
Our automatic target recognition (ATR) systems have
focused on the modeling of the parvocellular stream
for the learning and recognition of three-dimensional
(3-D) objects, although we have utilized image sequences
to accumulate evidence over time. The image
motion of objects can also be useful for recognizing
potential targets, and we have developed




neurocomputational systems [3] to extract such information
in real time (30 velocity fields per second) on
the Pipelined Image Processing Engine (PIPE), a video-rate
parallel-processing computer. The integration of
an object’s image motion with its shape information
can potentially enhance the ATR process, and is a
topic we are currently investigating.

The early visual processing that takes place in the
retina, lateral geniculate nucleus, geniculo-cortical connections,
and visual cortical areas V1, V2, and V4 of
the occipital lobe is responsible for

1.  conditioning  imagery  so  as  to  render  it  invari¬ 
ant  to  the  prevailing  illumination  (while  pro¬ 
ducing  smoothly  shaded  percepts  of  objects), 

2.  localizing  features  (such  as  edges,  high-curva¬ 
ture  points,  and  high-contrast  points)  that  de¬ 
scribe  2-D  shapes,  and 

3.  transforming  the  resulting  feature  pattern  so  as 
to  render  it  invariant  to  object  location,  scale, 
orientation  around  the  line  of  sight,  and  small 
deformation  due  to  any  foreshortening  resulting 
from  a  rotation  in  depth  (i.e.,  a  rotation  around 
an  axis  perpendicular  to  the  line  of  sight),  while 
still  retaining  measurements  of  these  spatial  at¬ 
tributes. 

These  invariant  representations  of  2-D  object  shapes 
make  their  way  to  the  inferior  temporal  cortex  via 
connections  between  the  occipital  and  temporal  lobes, 
whereas  the  location/scale/orientation  information  is 
relayed  to  the  posterior  part  of  the  parietal  lobe  via 
connections  between  the  occipital  and  parietal  lobes. 
Object-location  information  is  conveyed  to  the  pari¬ 
etal  lobe  also  via  the  superior  colliculus,  which  re¬ 
ceives  direct  connections  from  the  geniculate  nucleus 
and  is  intimately  involved  in  attentional  processes. 
These  two  cortical  pathways — one  subserving  object 
vision  (in  the  temporal  lobe)  and  the  other  subserv¬ 
ing  spatial  vision  (in  the  parietal  lobe) — have  come  to 
be known as the what and where systems [4]. Fusion
of  the  what  and  where  information  is  achieved  via 
reciprocal  connections  between  the  temporal  and  pa¬ 
rietal  lobes,  as  well  as  by  indirect  connections  be¬ 
tween  other  regions  of  the  brain  such  as  the  hip¬ 
pocampus,  although  the  details  are  not  yet 
understood. 

Insight  into  the  later  stages  of  visual  processing  and 



3-D object representation can be gained by studying
the superior temporal sulcus (STS) in the temporal
lobe of the macaque monkey. This area is known to
be the site of cells tuned for the recognition of faces
and other body parts. Of course, the faces that a
monkey recognizes are indicative of the monkey's
visual experiences, and reflect the visual learning process
itself. We have learned much from the work of
D.I. Perrett and his colleagues at the University of
St. Andrews in Scotland [5-8].

The notion of cells specifically tuned to the recognition
of certain objects (analogous to the orientationally
tuned edge-sensitive neurons in V1 discovered
by D. Hubel and T. Wiesel in 1959) was popularized
by H. Barlow in 1972, and became known as the
grandmother-cell hypothesis, as if to emphasize that a
single neuron becomes active to signal the recognition
of one’s grandmother. And, for the past 20 years,
a  debate  has  raged  over  this  notion  of  single-cell  ver¬ 
sus  distributed-network  coding  of  visual  objects.  In 
fact, this seemingly absurd notion of single-cell coding
seems to have much supporting evidence, as illustrated
in Perrett's work below (and confirmed by
other  investigators).  The  strict  notion  of  grandmother 
cells,  however,  must  be  reinterpreted  in  light  of  the 
fact  that  many  layers  of  processing  precede  the  view- 
specific  coding  of  objects,  and  a  hierarchical  pooling 
of  cells  is  required  to  influence  the  object-specific  cell. 
Moreover,  many  visual  objects  may  activate  this  cell, 
although it is maximally active for a specific object,
whereas other cells are more active in the case
of  the  other  objects.  Hence,  a  recognition  decision 
must  follow  a  neural  competition  between  grand¬ 
mother  cells,  and  possibly  an  evidence-accumulation 
phase  among  multiple  views  when  such  views  are 
available. 

Figure 1 (from Reference 5) illustrates the STS area
in  the  macaque  monkey  brain.  The  figure  shows  the 
locations  of  neurons  detected  by  Perrett  that  are  highly 
tuned  to  the  face  and  profile  views  of  heads,  rotations 
of  heads  between  specific  views,  and  conjunctions  of 
face views with up/down/left/right motions. Perrett's
subsequent  work  [7]  indicates  the  existence  of  view- 
specific  cells,  each  one  tuned  for  a  particular  view 
around  a  certain  class  of  heads,  and  still  other  cells, 
called  view-general  cells,  that  respond  to  any  view  of  a 




FIGURE  1.  View-based  coding  of  faces  in  the  temporal  cortex  of  the  macaque  monkey:  (a) 
lateral  view  of  the  monkey  brain,  (b)  coronal  cross  section  with  a  red  box  around  the 
superior  temporal  sulcus  (STS),  and  (c)  serial  sections  of  the  STS  area  investigated.  From 
left  to  right,  the  sections  illustrate  the  electrode  tracks,  cells  selective  to  face  views,  cells 
selective  to  profile  views,  cells  selective  to  transitions  between  views  during  head  rotations, 
cells  selective  to  faces  moving  left/right,  and  cells  selective  to  faces  moving  up/down. 
(Adapted from D.I. Perrett et al. [5], with permission from Trends in Neurosciences, Elsevier
Science Publishers B.V.)


specific  head  (as  if  the  view-general  cells  were  con¬ 
nected  to  all  of  the  corresponding  view-specific  cells). 
View-specific  cells  respond  to  the  same  face  views 
with similar activity levels, regardless of the illumination
strength or color, the size or 2-D orientation of
the  face,  and  the  position  of  the  face  in  the  field  of 


view.  Such  cells  have  apparently  learned  2-D-invari- 
ant  shape  codes. 

Figure 2 (from Reference 6) provides a striking
example of view and identity coding in the macaque
temporal cortex. In the experiment, a monkey was
shown different views of the faces of two familiar


[Figure 2 panels (a) and (b): cell activity plotted radially in spikes/sec.]


FIGURE 2. View and identity coding in the macaque temporal cortex for (a) subject 1 and (b) subject 2. In the experiment, a
monkey was shown different views of the faces of two familiar people (subjects 1 and 2), and the activity of a single STS
neuron in the monkey's brain was monitored with an electrical probe. The results are plotted in spikes/sec radially from the
central symbol; the black solid circle denotes the spontaneous background activity level. The experimental measurements are
represented by the large red dots with error bars indicating standard deviations over several repeated trials. Note that the
neuron has a clear preference for the right profile view of subject 1, and no significant response to any view of subject 2.
(Adapted from Perrett et al. [6], with permission from the Journal of Experimental Biology, the Company of Biologists Ltd.)


people, while an electrical probe monitored the activity
of a single STS neuron in the monkey's brain. The
results are shown in Figure 2, in which cell activity is
plotted radially from the central symbol and the solid
circle denotes the spontaneous background activity
level. The experimental measurements are represented
by the large dots with error bars indicating standard
deviations over several repeated trials. Figure 2(a) shows
that the neuron is highly tuned for the right profile
view of subject 1. Nearby views (some at an angle
from the right profile) still generate cell activity, though
at a much reduced rate. All views of subject 2 (a rather
different looking face) generate no significant activity
above the background level, as shown in Figure 2(b).

Thus this neuron might someday become a grandfather cell!

In summary, monkeys learn to recognize faces by
employing a view-based strategy. Representations of
2-D shapes are learned and stored in view-specific
STS cells. These cells code shape information that is
invariant to illumination, position, scale, orientation
around the line of sight, and small foreshortening
deformation. Other cells code transitions between
neighboring views that have been exposed by the
rotations of a subject's head. A hierarchical combination
of the two types of cells allows the construction
of view-general cells that are selectively activated by
specific heads regardless of the viewing direction. This
same strategy for the learning and recognizing of 3-D
heads (and, possibly, other objects) can be applied
usefully to the design of artificial neural systems for
ATR.

Aircraft  Recognition  from 
Visible  Image  Sequences 

We designed our first end-to-end ATR system for the
passive visible domain, and applied the system to
high-contrast imagery of model F-16, F-18, and HK-1
Spruce Goose aircraft moving against textured
backgrounds. (Note: Detailed descriptions of this neural
system are contained in several papers by M. Seibert
and A.M. Waxman [9-11].)




Figure 3 provides a conceptual overview of the
system, in which a temporal view sequence of an
object leads to the learning of an aspect graph [12]
representation of that 3-D object. We can divide the
system into three main functional stages, the first of
which performs 2-D view processing to extract features
(invariant to illumination) from the individual
images, group these features to locate object position,
and transform the features to render the pattern invariant
to scale, orientation, and small deformation.
The second stage takes these invariant feature patterns
and clusters them into categories of similar views,
or aspects. This 2-D view classification is done in an
unsupervised way; i.e., it is strictly data driven without
any category definition by a human. Along with
the learning of these aspect categories, a prototype


feature-pattern template is established for each category.
The aspect categories correspond to the nodes
of an aspect-graph representation of the target; they
also play the role of view-specific cells for aircraft. The
third stage detects the transitions over time between
aspect categories (while the target is tracked in relative
motion), learns these transitions, and accumulates
evidence for possible targets. The learned transitions
are like the arcs that connect the nodes in the aspect-graph
concept, and are reminiscent of the STS neurons
that are activated by the rotation of the heads
between views in Figure 1. The ability to accumulate
evidence over time is significant, for there are
often cases in which a single view of a target is not
sufficient to identify the target unambiguously; moreover,
this fusion of evidence leads to a notion of

[Figure 3(b) block diagram, stage headings: view sequence; 2-D view processing (feature extraction); 2-D view classification (pattern encoding); 3-D object hypotheses (transition detection).]

FIGURE  3.  Conceptual  approach  of  ATR  neural  system  for  passive  visible  image  sequences:  (a)  temporal  view  sequence  of 
images  and  corresponding  aspect  graph,  and  (b)  functional  block  diagram  of  system.  As  a  target  moves  relative  to  an 
observer,  qualitatively  different  views  are  exposed  in  a  temporal  view  sequence.  The  views  unfold  in  an  orderly  fashion  that 
is  represented  in  the  aspect  graph.  Each  image  in  the  sequence  is  processed  by  three  stages  of  networks  performing 
feature  extraction  and  invariant  mappings,  classification  of  feature  maps  into  aspect  categories,  and  3-D  ob|ect  evidence 
accumulation  from  the  recognition  of  categories  and  transitions  The  learned  categories  and  transitions  are  analogous  to 
the  nodes  and  arcs,  respectively,  of  an  associated  aspect  graph. 


[Figure 3(b) block diagram, stage functions: position, scale, orientation, and deformation invariance; aspect learning and recognition; transition learning; evidence accumulation; object recognition.]




confidence  in  the  recognition  decision. 
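
The third stage can be pictured with a short Python sketch. It is an illustration under stated assumptions, not the Aspect network described in this article: the function names `learn_transitions` and `accumulate_evidence`, the integer aspect labels, and the rule of adding one unit of evidence per recognized transition are all hypothetical choices made for the example.

```python
from collections import defaultdict


def learn_transitions(training_sequences):
    """Record, per object, which aspect-to-aspect transitions were observed."""
    learned = defaultdict(set)
    for obj, aspects in training_sequences:
        for a, b in zip(aspects, aspects[1:]):
            if a != b:
                # Store each transition as an unordered pair (an arc of the aspect graph).
                learned[obj].add(frozenset((a, b)))
    return dict(learned)


def accumulate_evidence(learned, observed_aspects):
    """Add one unit of evidence to every object whose learned arcs contain
    an observed aspect transition (hypothetical scoring rule)."""
    evidence = {obj: 0 for obj in learned}
    for a, b in zip(observed_aspects, observed_aspects[1:]):
        if a == b:
            continue  # consecutive views fell in the same aspect: no transition
        for obj, arcs in learned.items():
            if frozenset((a, b)) in arcs:
                evidence[obj] += 1
    return evidence


# Hypothetical labels: aspect 2 is shared by both objects, so a single view
# in aspect 2 is ambiguous, but the transitions around it are not.
training = [
    ("object_A", [1, 2, 3, 2, 1]),
    ("object_B", [4, 2, 5, 2, 4]),
]
learned = learn_transitions(training)
evidence = accumulate_evidence(learned, [2, 2, 3, 2, 1])
best = max(evidence, key=evidence.get)
print(evidence, best)
```

In this toy run the first observed view (aspect 2) supports both objects equally, and only the subsequent transitions separate them, which is the situation the text describes for ambiguous single views.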

These three processing stages can each be realized
with multiple neural networks, and together the networks
comprise a neural system architecture, as shown
in Figure 4. Here, each module is an individual network
that is annotated by the module's functional
role in the system. Two processing streams are shown:
the gray modules form a parvocellular stream, and the
red modules form an attentional stream.

In the system, images are captured with a conventional
CCD camera (which could be replaced by an
infrared imaging system) and objects are segmented
from the background by using a combination of motion
and contrast information. Next, a shunting center-surround
network enhances the edges of the segmented
object, and a Diffusion-Enhancement Bilayer
(DEB) extracts and dynamically groups the feature
points of high edge curvature into a position centroid,
as shown in Figure 5. These networks form
nonlinear dynamical systems in which individual nodes
are governed by (Hodgkin-Huxley-like) cell-membrane
equations that resemble the charging dynamics
of coupled resistor-capacitor networks. (See Reference
13 by S. Grossberg for a review of his pioneering
work on dynamical neural networks, including shunting
center-surround networks. Also, see References
14 and 15 for a reformulation of the DEB in terms of
coupled dynamical layers of astrocyte glial-like diffusion
cells and neural-like contrast-enhancing cells, all
inspired by biology and applied to the psychophysical
percept of long-range apparent motion.)
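
To make the shunting center-surround idea concrete, the sketch below evaluates the steady state of a commonly cited form of the shunting membrane equation, dx/dt = -A x + (B - x) C - (D + x) S, on a one-dimensional step edge. The article does not give its equations or constants, so the specific form, the values A = B = D = 1, the single-pixel center, and the two-neighbor surround are all assumptions chosen for illustration.

```python
def shunting_center_surround(signal, A=1.0, B=1.0, D=1.0):
    """Steady state of dx/dt = -A*x + (B - x)*C - (D + x)*S at each node,
    i.e. x = (B*C - D*S) / (A + C + S)."""
    n = len(signal)
    out = []
    for i in range(n):
        left = signal[max(i - 1, 0)]        # replicate values at the borders
        right = signal[min(i + 1, n - 1)]
        center = signal[i]                  # excitatory on-center input C
        surround = 0.5 * (left + right)     # inhibitory off-surround input S
        out.append((B * center - D * surround) / (A + center + surround))
    return out


step_edge = [1.0, 1.0, 1.0, 5.0, 5.0, 5.0]
response = shunting_center_surround(step_edge)

# The same scene under 100 times stronger illumination.
bright = shunting_center_surround([100.0 * v for v in step_edge])

print(response)
print(bright)
```

Uniform regions give zero response and only the nodes next to the edge respond. Because the inputs also appear in the denominator, the response stays bounded between -D and B however bright the scene becomes, which is one way to read the illumination insensitivity that the text attributes to these networks.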

The centroid determined by the DEB network is
used to track and fixate the object, and serves as the
origin of a log-polar transform of the extracted-feature
map. This transformation is very closely approximated
by the axonal connections between the lateral
geniculate nucleus and the primary visual cortex V1
[16]. In our system the transformation serves to convert
changes in 2-D scale and 2-D orientation of the
visual feature map into a translation along new orthogonal
axes. These processing steps are illustrated
for an F-18 silhouette in Figures 6(a), (b), and (c).
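
The claim that a log-polar transform turns scale and rotation into translation can be checked numerically. The following sketch is illustrative only: the helper names `log_polar` and `scale_and_rotate` and the three feature points are hypothetical, and the origin is simply taken to be (0, 0) rather than a centroid computed by a DEB.

```python
import math


def log_polar(points, origin):
    """Map (x, y) feature points to (log r, theta) about the given origin."""
    ox, oy = origin
    mapped = []
    for x, y in points:
        dx, dy = x - ox, y - oy
        r = math.hypot(dx, dy)
        if r == 0.0:
            continue  # the origin itself has no defined log-radius
        mapped.append((math.log(r), math.atan2(dy, dx)))
    return mapped


def scale_and_rotate(points, scale, angle):
    """Scale and rotate points about (0, 0)."""
    c, s = math.cos(angle), math.sin(angle)
    return [(scale * (c * x - s * y), scale * (s * x + c * y)) for x, y in points]


features = [(3.0, 0.0), (0.0, 2.0), (1.0, 1.0)]
before = log_polar(features, (0.0, 0.0))
after = log_polar(scale_and_rotate(features, 2.0, math.pi / 6), (0.0, 0.0))

# Every feature moves by the same offset: (log 2, pi/6).
shifts = [(a[0] - b[0], a[1] - b[1]) for a, b in zip(after, before)]
print(shifts)
```

Doubling the size shifts every feature by log 2 along the log-radius axis, and a 30-degree rotation shifts every feature by pi/6 along the angle axis. In general the angle axis is periodic, so a shift can wrap around; the example angles were chosen so that no wrap occurs.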

The log-polar feature map (periodic in the orientation
angle) is then input to a second DEB to determine
a new feature centroid in the transformed coordinates.
The spatial pattern of features now represents
the original view of the F-18 invariant to illumination,
position, scale, and orientation.

The next layer of processing indicated in Figure 4
consists of overlapping receptive fields: the processing is
aligned with the centroid that was detected on the
log-polar map, and serves to render the feature pattern
somewhat insensitive to nonlinear spatial deformation.
In the processing (Figure 7), a small array of
Gaussian-weighted overlapping receptors is excited
by the underlying features in the log-polar map, and
the output of the array provides a much compressed
code of the spatial feature pattern. (An individual
receptor is activated by the feature within the receptor’s
field that lies closest to the field's center, and the
feature’s distance is coded according to a Gaussian
falloff.) This compressed code is illustrated for the
F-18 in Figure 6(d) for the case of a 5 x 5 array of
overlapping receptors. In the figure, the sizes of the
dots correspond to the receptor activation level: the
larger the dot, the greater the activation. This coarse
coding of spatial feature patterns simultaneously provides
for enormous data reduction from the original
target image (compared, for example, with a direct
template-matching approach), leads to a tolerance for
small deformations due to rotations in depth and
inaccurate feature extraction, and yields an input vector
for the classification network that forms the next
system module.
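
A minimal sketch of this coarse coding follows. It assumes circular fields of a fixed radius on a 3 x 3 grid and a Gaussian falloff with width `sigma`; the function name `receptive_field_code`, the grid spacing, the radius, and the feature coordinates are hypothetical values picked for the example, not parameters taken from the system.

```python
import math


def receptive_field_code(features, centers, radius, sigma):
    """For each receptor, find the closest feature inside its field and
    code that distance with a Gaussian falloff (0.0 if the field is empty)."""
    code = []
    for cx, cy in centers:
        dists = [math.hypot(fx - cx, fy - cy) for fx, fy in features]
        inside = [d for d in dists if d <= radius]
        if inside:
            d = min(inside)
            code.append(math.exp(-d * d / (2.0 * sigma * sigma)))
        else:
            code.append(0.0)
    return code


# 3 x 3 grid of receptor centers with unit spacing; a radius of 0.8 makes
# neighboring fields overlap.
centers = [(x, y) for y in (0.0, 1.0, 2.0) for x in (0.0, 1.0, 2.0)]
features = [(0.0, 0.0), (1.3, 1.0), (1.9, 2.0)]
code = receptive_field_code(features, centers, radius=0.8, sigma=0.4)
print([round(c, 3) for c in code])
```

The feature at (1.3, 1.0) falls inside two overlapping fields and excites both, more strongly the one whose center is nearer. A small displacement of that feature changes both activations gradually instead of switching a single cell on or off, which is the source of the tolerance to small deformations. The nine-element output is the analog vector that would be passed on to the classifier.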

The later stages of vision support the learning and
recognition process. In our system, learning and recognition
are realized by two modules consisting of an
Adaptive Resonance Theory network (cf. several papers
on various ART networks in Reference 17) and an
Aspect network [10, 11].

Figure 8 illustrates the ART-2 architecture for unsupervised
category learning and recognition. (Note:
ART-2 is one implementation of Adaptive Resonance
Theory for patterns consisting of real numbers.) The
ART-2 network takes an N-dimensional input vector
(in our case, the overlapping receptive field pattern
with dimension of order 10 to 100) and first processes
it through circuitry that contrast-enhances and
normalizes the input as a short-term memory (STM)
pattern. ART-2 then passes this pattern through a
bottom-up filter (or template) stored in long-term
memory (LTM) to excite a field of STM category




[Figure 4 diagram labels: object-background separation (segmentation network); edge enhancement (center-surround network); feature extraction and grouping; orientation and scale invariance; view coding and deformation invariance; view learning and recognition; transient detector; diffuse-enhance network; object learning; recognized object.]


FIGURE  4.  Modular  system  architecture  for  the  learning  and  recognition  of  3-D  targets  from  visible  imagery.  The  system  is 
organized  into  two  streams  of  neural  network  modules:  the  gray  parvocellular  stream  for  invariant  shape  learning  and 
recognition, and the red attentional stream. The functional role of each module is indicated along with the type of
network. 


[Figure 5 diagram labels: edge enhancement; diffusion; local maximum detection; panel (a); edge-enhanced image with maxima superimposed.]


FIGURE 5. Diffusion-Enhancement Bilayer (DEB) for feature extraction and grouping: (a) architecture diagram and (b)
evolving map of high-curvature points. The first stages of processing are accomplished by center-surround networks to
edge-enhance the segmented object, and a diffusion-enhancement network to isolate points of high curvature along the
silhouette. These feature points are dynamically grouped into a centroid (providing a focus of attention) by another DEB,
which couples a diffusion layer to a contrast-enhancing layer in a feedforward and feedback configuration. (For a detailed
description of DEBs, see References 9, 14, and 15.)
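
The grouping of feature points into a centroid by diffusion followed by local-maximum detection can be illustrated in one dimension. The sketch below is a deliberately reduced stand-in: it is feedforward only and omits the feedback coupling between the two layers of a DEB, and the three-point smoothing kernel, the grid size, and the function names `diffuse` and `local_maxima` are assumptions made for the example.

```python
def diffuse(values, steps):
    """Repeated 3-point smoothing (a discrete diffusion), zero beyond the borders."""
    v = list(values)
    for _ in range(steps):
        padded = [0.0] + v + [0.0]
        v = [0.25 * padded[i - 1] + 0.5 * padded[i] + 0.25 * padded[i + 1]
             for i in range(1, len(padded) - 1)]
    return v


def local_maxima(values):
    """Indices whose value is strictly greater than both neighbors."""
    padded = [0.0] + list(values) + [0.0]
    return [i - 1 for i in range(1, len(padded) - 1)
            if padded[i] > padded[i - 1] and padded[i] > padded[i + 1]]


# Two point features on a 31-node line.
features = [0.0] * 31
features[10] = 1.0
features[20] = 1.0

early = local_maxima(diffuse(features, 5))    # features still separate
late = local_maxima(diffuse(features, 200))   # features merged into one peak
print(early, late)
```

After a few diffusion steps each feature still produces its own maximum. After many steps the two blurred features merge into a single maximum midway between them, which plays the role of the group centroid.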




FIGURE 6. Stages in the processing of a 2-D view of a model F-18 aircraft: (a) the original image, (b) the edge-enhanced
silhouette with DEB features superimposed and the centroid marked, (c) log-polar mapping of the image in
part b, with the new centroid marked, and (d) the resulting output of a 5 x 5 array of overlapping receptive fields
(see Figure 7) that forms the pattern fed to the Adaptive Resonance Theory (ART-2) network. In the image in part d, larger
dots represent greater activity in the corresponding receptive fields.


nodes (our view-specific cells, or aspect nodes). These
category nodes compete among themselves to choose
a maximally activated winner, which in turn activates
top-down feedback of a learned template also stored
in LTM. This feedback represents the network’s expectation
of a specific input pattern. A vigilance parameter
ρ (in the interval 0 to 1) that is set in advance
by the user mediates the matching of the enhanced
input pattern with the top-down template. Thus,
simply having a best match among already established
categories is not enough; rather, the best match must
satisfy the established vigilance. When the match does
satisfy the vigilance criterion, the network goes into a
state of resonant oscillations between layers, and the
bottom-up and top-down filters adapt slightly for
better representation of the recent input pattern. When
the vigilance criterion has not been met, the network
generates a reset signal that flips the category field,
thus suppressing the recent winner and reactivating
the former losers. In this way, an uncommitted category
node can establish a new category and a new
template can be learned. ART-2 has several important
attributes that make it particularly well suited to ATR
applications: it supports on-line, real-time, unsupervised,
stable category learning and refinement. We
have utilized ART-2 successfully in a number of
applications.
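
The role of the vigilance parameter can be illustrated with a toy clustering routine. This is not the ART-2 circuit of Figure 8: the class `SimpleART`, the use of a cosine match for both category choice and vigilance test, the learning rate, and the five example patterns are all assumptions made for the sketch. Because choice and match use the same measure here, the search loop never accepts a later category once the winner has failed; it is kept only to mirror the reset-and-search sequence described above.

```python
import math


def normalize(v):
    """Scale a vector to unit length."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]


class SimpleART:
    """Toy vigilance-gated clustering; NOT the full ART-2 circuit."""

    def __init__(self, vigilance, learning_rate=0.1):
        self.vigilance = vigilance
        self.learning_rate = learning_rate
        self.templates = []  # one unit-length template per committed category

    def present(self, pattern):
        x = normalize(pattern)
        # Bottom-up activation of each committed category node.
        scores = [sum(a * b for a, b in zip(x, t)) for t in self.templates]
        # Search categories from most to least active (winner first).
        for j in sorted(range(len(scores)), key=lambda k: -scores[k]):
            if scores[j] >= self.vigilance:
                # Match accepted: nudge the template toward the input.
                lr = self.learning_rate
                blended = [(1 - lr) * t + lr * xi
                           for t, xi in zip(self.templates[j], x)]
                self.templates[j] = normalize(blended)
                return j
            # Otherwise this node is "reset" and the next one is tried.
        # No committed category matched: commit a new one.
        self.templates.append(x)
        return len(self.templates) - 1


views = [[1.0, 0.0, 0.0], [0.9, 0.1, 0.0], [0.0, 1.0, 0.0],
         [0.0, 0.9, 0.2], [0.7, 0.7, 0.0]]


def count_categories(vigilance):
    net = SimpleART(vigilance)
    for v in views:
        net.present(v)
    return len(net.templates)


loose = count_categories(0.6)
strict = count_categories(0.99)
print(loose, strict)
```

With the lower vigilance the five patterns fall into two categories; with the higher vigilance the same patterns are split into four. Raising the vigilance producing more, finer categories is the qualitative behavior the text describes.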

To present our results for the ART-2 classification
of different aircraft, we introduce the concept of a
viewing sphere, as illustrated in Figure 9. Note that a


[Figure 7 diagram label: output of the 3 x 3 array below.]


FIGURE  7.  Spatial  coding  of  features  by  overlapping  re¬ 
ceptive  fields.  Each  circular  field  is  activated  according  to 
a  Gaussian-weighted  distance  to  the  point  feature  that  is 
closest  to  the  receptor  center.  (Note:  Lighter  colors  in  the 
figure represent closer distances.) These receptors provide
enormous data compression, and they code spatial
relations  of  features  robustly  with  respect  to  deformations 
due  to  foreshortening.  The  fields  convert  a  binary  feature 
map  to  an  analog  pattern  that  is  then  suited  for  ART-2 
classification. 


[Figure 8 diagram labels: gain control; input pattern, panels (a) and (b).]


location on the viewing sphere for an example aircraft
corresponds to the view of that aircraft as seen from
that particular direction. Using this viewing-sphere
concept, Figure 10 summarizes the results of feature
extraction, coding, and ART-2 classification for an
F-18 model aircraft. With 5 .VS input views of the F-18
and a vigilance ρ of 0.93, ART-2 generates 12 categories
of the aircraft. In Figure 10(a), the categories, or
aspects, are shown color coded on an aspect sphere
with 12 different unrelated colors (i.e., a dark blue
has no relation to a light blue). Note that the aspects
subtend finite solid angles on the sphere (the target is
oriented with its nose to the left); because of object
silhouette symmetry, only one quadrant of the sphere
is shown. We can visualize example silhouettes that
correspond to the 12 categories by selecting locations
on the aspect sphere falling at the centers of each of
the established categories, as shown in Figure 10(b).
The corresponding silhouettes (numbered 1 through
12 in Figure 10(c)) represent prototype views that the
system has created in an unsupervised manner. Notice
the variety of silhouettes selected: some prototype
views capture the wing shapes, some capture the double
tail fins, some capture the dual exhausts, while others
emphasize traditional top and side views. Also note
the similarity between silhouettes 2 and 5, given the
proximity of their corresponding centroids in Figure
10(b). Yet, although similar, silhouettes 2 and 5 do
exhibit subtle differences, e.g., the differing slopes of
the top portion of the visible tail fin. All of the views
in Figure 10(c) were selected automatically. When the
vigilance ρ was raised above 0.93, the
ART-2 network generated 24 categories.

In  addition  to  the  F-18,  we  have  also  investigated 

FIGURE 8. ART-2 network: (a) architecture and (b) circuit
model. ART-2 takes analog input patterns and clusters
them into categories by using unsupervised competitive
learning. ART-2 can be trained on a dataset, then used to
recognize data patterns in the field while continuing to
refine its learned category representations (i.e., templates)
stored in its adaptive synapses. The vigilance parameter ρ
mediates the matching of the enhanced input pattern stored
in short-term memory (STM) with a learned template from
long-term memory (LTM). (Adapted from G.A. Carpenter
et al. [17], with permission. This reference also contains a
detailed description of ART.)




FIGURE 9. Example viewing sphere for a fighter aircraft. Note that a location on the sphere corresponds to the view of the
aircraft as seen from that particular direction. Silhouettes of the aircraft are shown from different viewing directions. The
silhouettes were obtained by applying thresholds to imagery that was captured with a charge-coupled device (CCD)
camera and frame grabber. (The jagged contours reflect the finite pixel sizes of the CCD imager.)


ART-2 classification for an F-16 and HK-1 Spruce
Goose model aircraft, as shown in Figure 11. Approximately
500 views of each aircraft were collected
and processed with a single ART-2 module. (The next
section, “Tactical Target Recognition in the Synthetic-Aperture
Radar (SAR) Spotlight Mode,” presents an
alternative strategy of using one ART-2 module per
target for the SAR application.) All the views together
resulted in only fl independent categories at a vigilance
ρ of 0.93. (Note: Figure 10 investigated the
categorization of a single target by using views of just
that target. Thus, at the same vigilance setting of
0.93, ART-2 generated only 12 categories, in contrast
to the 26 categories of Figure 11(a).) The individual
aspect spheres show the similarity in category layout
between the two fighter aircraft, and the obvious
differences between fighter and transport-like aircraft.

The spheres also indicate how many views of the
two fighters are ambiguous, at least in terms of the
features extracted by the system. The individual 3-D

targets are represented by roughly 25 categories each.
Note in Figure 11 that some of the categories are
common to two or more of the targets; i.e., the light
yellow in Figure 11(a) corresponds to the same category
that is represented by the same light yellow in
Figure 11(b).

I  hi-  aspect  spheres  m  ! -inure  I  1  also  tlluxiiau-  the 
neielthor  relations  amone  v.iu-eortex  as  one  rotates  ot 
explores  a  tareet  m  A  i  ).  I  luxe  netulthoi  illations 
sonespond  to  permitted  tiansitions  amone  iau;;o 
ties,  and  are  learned  and  exploited  In  out  .  b/>< .  /  mi 
work.  Much  like  the  S  I  S  n  ils  that  code  v  tew  n. nisi 
lions,  and  the  hierarchic  al  poolmp  ol  view  specific 
nils  to  liinn  view  eeneral  ol>|nt  specific  nils,  out 
\spix  i  ne tw orks  self  o|e,int/e  into  i ounce  t toils  amone 
aspect  c.itvcot,  nodes  that  prekti  tit  tails  iliannel  ,n 
nv  it v  into  c nr  respond  me  i  l  )  oh|n  l  nodes  vv  lu  u  sin 
nxxiie  aspects  on  tit  in  a  pcimitlcd  xd|ucinv  I  In 
\spn  t  networks  learn  these  aspn  I  tiansitions  nun 
memallv  ditiini;  lontiolkd  n.iiniin:  sessions,  or  mi 


•  WAXMAN  El  At . 

Xi'ttrul  jur  Autonuitii  1  iirgt't  1  ranting  unit  AY,  thulium 


FIGURE  10.  Results  of  feature  extraction,  coding,  and  ART-2  classification  of  an  F-18  model  aircraft  alone  at  a 
vigilance  p  of  0.93:  (a)  aspect  sphere  showing  the  12  aspects  (color  coded)  generated  by  ART-2,  (b)  centroids  of 
the  largest  regions  of  the  12  aspects,  or  categories,  and  (c)  corresponding  example  silhouettes  of  the  regions  in 


part  b.  These  views  have  been  selected  automatically 
sphere  have  been  selected  arbitrarily;  i.e.,  a  dark  blue 

daily  in  the  field  after  the  aspect  categorization  has 
stabilized  (i.e.,  after  repeated  exposures  to  the  training 
data  yield  the  same  categorization).  Then,  during  the 
imaging  of  a  target  in  morion,  multiple  viewpoints 
are  experienced,  leading  to  recognition  of  multiple 
aspects  by  the  ART- 2  network,  followed  bv  evidence 
accumulation  by  the  object  nodes  in  the  Aspect  net- 


system.  (Note:  The  12  colors  used  tor  the  aspect 
no  relation  to  a  light  blue.) 

work.  Target  trajectories  are  realized  as  a  set  of  aspect 
categories  linked  together  by  aspect  transitions.  Much 
more  information  becomes  available  when  we  con¬ 
sider  the  aspect  transitions  among  ambiguous  view's. 
Tor  example,  even  if  both  views  of  a  two-aspect  se¬ 
quence  are  each  ambiguous  among  potential  targets, 
the  additional  aspect-transition  information  is  often 


88 


*  WAXMAN  1 1  AL. 

Neural  Systems  for  Automata  I arget  Learning  and  Recognition 


sufficient  tor  the  preferential  activation  of  the  correct 
target  node  in  the  Aspect  network. 

Figure 12 illustrates an Aspect network for a single object, along with an enlarged view of the network's adaptive axo-axo-dendritic synapse. This synapse brings together in close physical proximity projections from pairs of aspect nodes onto a branch of the dendritic tree leading to an object node. When ART-2 categories are excited in temporal succession, the aspect nodes shown charge or discharge exponentially like capacitors, and their temporal overlap of activity supports a Hebbian form of correlational learning on the connecting synapse (cf. Reference 13 for a discussion of modified Hebbian learning with gated decay). The synaptic weights lie in the interval [0,1], and, as category transitions are experienced, the weights asymptotically approach the extreme values of 0 (implying no allowed transition between corresponding categories) and 1 (indicating a permitted transition). These values correspond to the absence or presence of an arc in the associated aspect-graph representation. The dendritic tree with its synaptic connections resembles the symmetric state-transition matrices that are commonly used in system modeling techniques.
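The behavior described above (capacitor-like aspect nodes, and weights driven toward 0 or 1 by correlated activity) can be illustrated with a small Euler simulation. This is only a sketch under stated assumptions, not the authors' implementation: the rate constants, the noise level `eps`, the threshold `theta`, the initial weight `w0`, and the dwell time are all hypothetical values, and the two gating terms that restrict learning to the winning, actively changing object are simply held at 1.

```python
def phi(z, theta):
    """Threshold-linear function: passes z unchanged above theta, else 0."""
    return z if z > theta else 0.0


def learn_transitions(aspect_sequence, n_aspects, dwell=1.0, dt=0.01,
                      kx=1.0, lx=1.0, kw=5.0, lw=0.05,
                      eps=0.01, theta=0.001, w0=0.5):
    """Euler-integrate aspect-node activities and synaptic weights.

    All numeric defaults are illustrative assumptions. The gates on the
    winning object and on changing object activity are assumed open (= 1).
    """
    x = [0.0] * n_aspects                              # aspect-node activities
    w = [[w0] * n_aspects for _ in range(n_aspects)]   # diagonal is unused
    steps = int(round(dwell / dt))
    for aspect in aspect_sequence:
        for _ in range(steps):
            # Aspect nodes charge when their category is active, else decay.
            for i in range(n_aspects):
                drive = 1.0 if i == aspect else 0.0
                x[i] += dt * (kx * drive - lx * x[i])
            # Hebbian update with quadratic shunting term w * (1 - w).
            for i in range(n_aspects):
                for j in range(i + 1, n_aspects):
                    corr = phi((x[i] + eps) * (x[j] + eps), theta)
                    dw = kw * w[i][j] * (1.0 - w[i][j]) * (corr - lw)
                    w[i][j] += dt * dw
                    w[j][i] = w[i][j]                  # keep the array symmetric
    return w


# Aspects 0 and 1 alternate repeatedly; aspect 2 is never experienced.
weights = learn_transitions([0, 1] * 15, n_aspects=3)
print(round(weights[0][1], 3), round(weights[0][2], 3))
```

In this toy run the weight between the two alternating aspects climbs toward 1, while the weights involving the unvisited aspect drift toward 0, which mirrors the presence or absence of an arc in an aspect graph.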

Extending the Aspect-network concept to multiple
targets leads to the network architecture shown in
Figure 13. In this design we consider all aspect categories
of all targets as belonging to the same ART-2


FIGURE 11. Aspect spheres for the (a) F-18, (b) F-16, and (c) HK-1 (Spruce Goose) have been generated from 535, 530, and 423 views, respectively, of each aircraft. Feature extraction, invariant mappings, and ART-2 categorization of all 1488 views generate a total of 41 aspects, or categories, at a vigilance ρ of 0.93. The number of categories generated for the individual aircraft is 26 for the F-18, 24 for the F-16, and 28 for the HK-1. Note that many categories are common to more than one target; i.e., the light yellow in part a corresponds to the same category that is represented by the same light yellow in part b. Also note the resemblance of the aspect spheres for the two fighter aircraft, in contrast to the HK-1 aspect sphere.




FIGURE 12. Aspect network for the single-object case: (a) network and (b) enlarged view of one synapse of the network. The aspect nodes (blue) are each coupled to corresponding categories allocated by the ART-2 network; the nodes charge and decay like capacitors. Axons (wires) emanating from each aspect node cross each other to form a transition matrix, and each crossing has an associated axo-axo-dendritic synapse (red) onto the dendritic tree (orange) of the object node. When two aspect nodes are simultaneously active (during view transitions), they strengthen the synapse (red) via modified Hebbian learning, and conduct activity onto the dendrite toward the object node. Object nodes thus pool activity from aspect nodes, exploiting transition information to amplify this activity, thereby accumulating evidence over time. In the enlarged view, the synapse brings together activity from aspect nodes $X_i$ and $X_j$ (as well as a background-noise level $\epsilon$) and channels it onto the dendritic tree. (Note: The box “Aspect Network Learning Dynamics” contains a description of the equations that govern the aspect nodes, object nodes, and synaptic weights. For further details of Hebbian learning and Aspect networks, see References 10, 11, and 13.)


network. The aspect categories of the ART-2 network drive a single set of aspect nodes that fan out to all the synaptic arrays of possible targets. Activity (i.e., evidence) is then channeled into the object nodes, which compete to select the target with the maximum evidence at that moment. The winning object is then able to modify its own transition array. Sudden saccadic eye/camera motions to other locations in a scene initiate a reset of object-node activities to zero; smooth tracking motions do not cause such resetting.
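A compact way to see how shared aspect nodes feed several object nodes is to simulate evidence accumulation for two targets whose transition arrays are already learned. The sketch below is illustrative only: the target names, the permitted-transition lists, and every numeric constant are hypothetical, the competition layer is reduced to picking the maximum, and the saccadic reset is omitted.

```python
import itertools


def phi(z, theta):
    """Threshold-linear function: passes z unchanged above theta, else 0."""
    return z if z > theta else 0.0


def transition_array(n_aspects, permitted_pairs):
    """Symmetric 0/1 array with 1 at each permitted aspect transition."""
    w = [[0.0] * n_aspects for _ in range(n_aspects)]
    for i, j in permitted_pairs:
        w[i][j] = w[j][i] = 1.0
    return w


def accumulate_evidence(aspect_sequence, arrays, n_aspects, dwell=1.0, dt=0.01,
                        kx=1.0, lx=1.0, ky=1.0, ly=0.2, eps=0.01, theta=0.001):
    """Drive one shared set of aspect nodes and pool evidence per target.

    Returns the final evidence per target and the target of maximum
    evidence. All numeric defaults are illustrative assumptions.
    """
    x = [0.0] * n_aspects
    y = {name: 0.0 for name in arrays}
    pairs = list(itertools.combinations(range(n_aspects), 2))
    steps = int(round(dwell / dt))
    for aspect in aspect_sequence:
        for _ in range(steps):
            for i in range(n_aspects):
                drive = 1.0 if i == aspect else 0.0
                x[i] += dt * (kx * drive - lx * x[i])
            for name, w in arrays.items():
                pooled = sum(phi((x[i] + eps) * w[i][j] * (x[j] + eps), theta)
                             for i, j in pairs)
                y[name] += dt * (ky * pooled - ly * y[name])
    winner = max(y, key=y.get)   # stand-in for the competition layer
    return y, winner


# Hypothetical targets: A permits 0<->1 and 1<->2; B permits 0<->3 and 2<->3.
arrays = {
    "target_A": transition_array(4, [(0, 1), (1, 2)]),
    "target_B": transition_array(4, [(0, 3), (2, 3)]),
}
evidence, winner = accumulate_evidence([0, 1, 2], arrays, n_aspects=4)
print(winner, evidence)
```

Aspects 0 and 2 are individually ambiguous between the two hypothetical targets, yet the observed transitions 0 to 1 and 1 to 2 are permitted only for `target_A`, so its object node accumulates far more evidence.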

Figure 14 contains an example of aircraft recognition by the Aspect network. In the training sequence, each of the three model aircraft experiences an identical trajectory of 2000 views covering one quadrant of the viewing sphere. Then a test sequence of 50 F-16 images is generated, and evidence is accumulated for each of the three targets as well as for an unlearned other target representing a none-of-the-above category. The graphs shown in the figure illustrate the corresponding category (and transition) sequence, the evidence accumulation and decay for each possible target, and the winning object with the instantaneous maximum evidence. Note that initially the system begins selecting the “other” target until sufficient evidence accumulates to declare the F-16 the winner, and it remains so. Reference 11 contains further details of this experiment.

At this point we have the basic design of a neural ATR system. The system has a number of definite strengths, but it also suffers from a few shortcomings. For example, a difficulty exists in adding new targets once the system has stabilized, because new data may modify the existing ART-2 category templates and lead to the need to retrain the Aspect network. A more efficient design is to assign a separate ART-2 network and (much compressed) Aspect network to each potential target, but allow the unsupervised assignment of aspect categories during the controlled exposure in a training session. By doing so, we can add new targets at a later time by simply adding new




ASPECT NETWORK LEARNING DYNAMICS

We developed the Aspect network (Figures 12 and 13) as a means to fuse recognition events over time. The network embodies a hierarchical pooling of view-specific aspect categories so as to exploit the additional information associated with permitted category transitions. These transitions are learned by exploring the object.

The dynamics of Aspect networks is in the form of differential equations (shown below) governing the short-term memory activity of the aspect nodes $X_i$ and object nodes $Y_k$, and long-term memory of the adaptive axo-axo-dendritic synapses $W^k_{ij}$. Aspect nodes are excited by their corresponding ART-2 category nodes $I_i$ (with rate constant $\kappa_X$) and passively decay back to their resting state (with rate constant $\lambda_X$).

Object nodes accumulate evidence for each object by summing the activity (with rate constant $\kappa_Y$) entering from the aspect nodes on the dendritic tree. Activity riding atop background noise $\epsilon$ enters via the learned synapses corresponding to permitted transitions, and activity is channeled most effectively by paired aspect nodes in a permitted sequence. The function $\Phi_\theta(A)$ is a threshold-linear function that passes activity levels when $A > O(\theta)$. Similar to the aspect nodes, the object nodes also decay passively to their resting state (with rate constant $\lambda_Y$).

The synaptic weights learn aspect transitions by experiencing correlated activity from two aspect nodes, as long as the object node activity is changing (i.e., $\dot{Y}_k \neq 0$) for the winning object $Z_k$. The function $\Theta_\epsilon(C)$ is a binary threshold gate that equals unity when $C > O(\epsilon)$. The weights asymptotically approach the fixed points of 0 and 1 because of the quadratic shunting terms that modulate the rate constant $\kappa_W$. For further details of Aspect networks, see References 1 and 2.

Aspect nodes:
\[ \frac{dX_i}{dt} = \kappa_X I_i - \lambda_X X_i \]

Object nodes:
\[ \frac{dY_k}{dt} = \kappa_Y \sum_i \sum_{j>i} \Phi_\theta\left[ (X_i + \epsilon)\, W^k_{ij}\, (X_j + \epsilon) \right] - \lambda_Y Y_k \]

Synaptic weights:
\[ \frac{dW^k_{ij}}{dt} = \kappa_W W^k_{ij} \left( 1 - W^k_{ij} \right) \left\{ \Phi_\theta\left[ (X_i + \epsilon)(X_j + \epsilon) \right] - \lambda_W \right\} \Theta_\epsilon(\dot{Y}_k)\, \Theta_\epsilon(Z_k) \]

References

1. M. Seibert and A.M. Waxman, “Learning and Recognizing 3D Objects from Multiple Views in a Neural System,” chap. 11.12 in Neural Networks for Perception, Vol. 1, ed. H. Wechsler (Academic Press, New York, 1991), pp. 426-444.

2. M. Seibert and A.M. Waxman, “Adaptive 3-D Object Recognition from Multiple Views,” IEEE Trans. Pattern Anal. Mach. Intell. 14, 107 (1992).


ART-2 and Aspect networks, without any modification to the existing networks. Moreover, separate ART-2 networks for each target better support the ATR task given only a single view (as opposed to a sequence of views), because each target will have generated its own set of learned templates within its ART-2 module. This design has been adopted for the next application: target recognition from SAR spotlight sequences. For this application we also introduce a measure of recognition confidence derived from the accumulated evidence.

Tactical  Target  Recognition  in  the  Synthetic- 
Aperture  Radar  (SAR)  Spotlight  Mode 

High-resolution radar imaging of a scene can be accomplished by flying a radar that is transmitting chirp pulses from many closely spaced look angles (Figure 15). The moving radar thus synthesizes a long aperture, and the return pulses determine a reflectivity image of the scene as projected into the range and cross-range coordinates of the plane formed by the synthetic aperture and the radar line of sight. (This plane is referred to as the synthetic-aperture radar [SAR] slant plane.) The range resolution becomes finer in proportion to the bandwidth of the chirp pulse; the cross-range resolution becomes finer in proportion to the angle subtended by the synthetic aperture. As the radar moves along the flight path, it can be “squinted” so as to track a fixed location on the ground. Hence, the radar beam spotlights a particular scene, and a sequence of SAR images is obtained of that scene from multiple views.
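The two resolution relationships can be made concrete with the standard textbook approximations: range resolution c/(2B) for chirp bandwidth B, and cross-range resolution λ/(2Δθ) for a small synthetic-aperture angle Δθ. The sketch below applies them to a 1-ft resolution cell. The 33-GHz carrier frequency is an assumption chosen only to represent a millimeter-wave radar; it is not a parameter taken from this article.

```python
C_LIGHT = 299_792_458.0   # speed of light, m/s
FOOT = 0.3048             # metres per foot


def range_resolution(bandwidth_hz):
    """Range resolution (m) of a chirp with the given bandwidth."""
    return C_LIGHT / (2.0 * bandwidth_hz)


def cross_range_resolution(wavelength_m, aperture_angle_rad):
    """Cross-range resolution (m) for a small synthetic-aperture angle."""
    return wavelength_m / (2.0 * aperture_angle_rad)


def bandwidth_for(resolution_m):
    """Chirp bandwidth (Hz) needed for the requested range resolution."""
    return C_LIGHT / (2.0 * resolution_m)


def aperture_angle_for(resolution_m, wavelength_m):
    """Aperture angle (rad) needed for the requested cross-range resolution."""
    return wavelength_m / (2.0 * resolution_m)


assumed_frequency_hz = 33e9                 # assumption: millimeter-wave carrier
wavelength = C_LIGHT / assumed_frequency_hz
bandwidth = bandwidth_for(FOOT)
angle = aperture_angle_for(FOOT, wavelength)

print(f"bandwidth for 1-ft range resolution: {bandwidth / 1e6:.0f} MHz")
print(f"aperture angle for 1-ft cross-range resolution: {angle:.4f} rad")
```

Under these assumptions a 1-ft range cell needs roughly half a gigahertz of chirp bandwidth, and a 1-ft cross-range cell needs the radar to sweep through an angle of a little under one degree.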


The reader may look ahead to Figure 26(a) to view a typical clutter scene (an overpass that crosses the New York State Thruway) obtained from the Lincoln Laboratory Advanced Detection Technology Sensor (ADTS), a millimeter-wave radar, operating in the SAR mode. (In our work, only single-channel vertical-vertical [VV] polarization imagery is used.) Note that the image is quite speckled, a consequence of the coherent imaging method. Nonetheless, at first glance this scene has a rather natural appearance.

To illustrate here the appearance of objects such as ground vehicles, we refer to the inverse SAR, or ISAR, images shown in Figure 16. Three tactical targets are shown at a radar depression angle (or slant-plane

FIGURE  13.  Aspect  network  for  the  multi-object  case.  Input  aspect  categories  from  a  single  ART-2  network  (coding  all 
aspects  of  all  targets)  excite  aspect  nodes  that  fan  out  to  all  synaptic  arrays  of  learned  view  transitions,  each  of  which 
conducts  activity  (i.e.,  evidence)  to  its  corresponding  object  node.  A  competition  layer  (created  from  self-excitation  and 
collective  inhibition)  determines  the  target  of  maximum  evidence  at  any  moment,  and  allows  the  corresponding  synaptic 
array  to  be  refined.  Sudden  eye/camera  motions  can  cause  the  object  nodes  to  reset  their  evidence  to  zero.  (For  a  detailed 
description  of  Aspect  networks,  see  References  10  and  11.) 






FIGURE  14.  Example  of  training  and  recognition  by  evidence  accumulation:  (a)  view  sphere  showing  the  trajectory 
from  which  2000  views  of  each  aircraft  were  used  for  training  the  system,  (b)  view  sphere  showing  the  trajectory  from 
which  50  views  of  an  F-16  were  selected  for  testing  the  system,  and  (c)  graphs  showing  the  recognition  test  results.  In 
part  c,  the  first  graph  plots  the  sequence  of  aspects  that  were  recognized  by  the  system  (note  the  transitions).  The 
second  graph  shows  the  activity  (i.e.,  evidence)  of  the  aspect  node  for  each  aircraft  target,  including  an  unlearned 
target (referred to as “other”). And the final graph shows the “winning object,” or target of maximum evidence at each
moment. Note that the system first declares the target as “other,” but then generates sufficient evidence to declare it
correctly  as  an  F-16,  and  that  correct  recognition  response  is  maintained. 


slope)  of  15°  and  three  azimuthal  angles  correspond¬ 
ing  to  front-on,  intermediate,  and  broadside  views. 
The  images  were  obtained  by  rotating  each  target  on  a 
turntable  in  front  of  a  stationary  radar.  Unlike  with 
Figure  26(a),  the  man-made  metallic  objects  in  Figure 
16  do  not  yield  radar  images  that  resemble  their 
visible  counterparts.  The  ISAR  images  are  dominated 
by  strong  returns  from  select  scattering  centers  on  the 
target,  sidelobe  responses,  and  speckle  noise.  Both 
Figures 16 and 26(a) possess 1-ft resolution in range
(oriented  vertically)  and  cross-range  (oriented  hori¬ 
zontally),  with  the  near-range  (closest  to  the  radar) 
located  at  the  top  of  the  image. 

To  build  an  ATR  system  that  exploits  spotlight¬ 


mode  SAR  sequences,  we  can  utilize  many  of  the 
ideas  and  neural  modules  developed  for  the  visible 
imaging  domain,  as  presented  in  the  preceding  sec¬ 
tion.  The  different  sensing  modality  of  radar,  how¬ 
ever,  provides  us  with  direct  range  and  cross-range 
information,  and  hence  object  size,  which  can  be 
exploited  in  the  grouping  process  that  is  used  to 
detect  potential  targets.  On  the  other  hand,  our  ear¬ 
lier  methods  of  invariant  processing  must  be  altered. 
In  particular,  the  log-polar  transform  must  be  dis¬ 
carded  because  the  slant-plane  image  is  not  an  angle- 
angle  image  (as  is  obtained  in  passive  visible  or  infra¬ 
red  imaging). 

Borrowing  heavily  from  our  work  in  the  passive 




FIGURE 15. Imaging geometry for spotlight-mode synthetic-aperture radar (SAR). A radar on board an aircraft illuminates an area of interest on the ground by pointing at a depression angle and squint angle. As the aircraft flies along a straight path at altitude h, the radar transmits chirp pulses from many closely spaced look angles, and the return pulses determine a reflectivity image of the ground patch and objects of interest. Progressing along the flight path, the radar is steered to illuminate the same area of interest, and thus obtains a sequence of SAR images from multiple look angles.


visible domain, the conceptual approach to SAR target learning and recognition is summarized in Figure 17 (compare to Figure 3). Again, each image of the spotlight sequence is processed through three stages. The first stage extracts features, detects potential targets by grouping the features, and estimates the orientation of each potential target. The second stage is again responsible for the adaptive categorization of feature patterns into aspects, or categories (leading to an aspect-sphere representation of the targets). And the third stage detects aspect transitions (analogous to the arcs of a corresponding aspect graph), accumulates evidence over time, and generates a recognition decision as well as a dynamic confidence measure.

Figure 18 shows the end-to-end neural system that we have developed. A quick inspection of the modules, which are organized into three rows representing the three stages of processing of Figure 17, reveals many of the same neural networks that were used for the passive visible domain. It is also evident that we have learned from our work in that area: we now utilize a separate ART-2 network for each target, and a separate Aspect network connected to each of these ART-2 networks. The following figures illustrate the various processing modules shown in the system diagram.

Each SAR image of the spotlight sequence is processed by the entire chain of neural modules. There are, however, several opportunities to exploit the temporal flow of information inherent in the processed data. The very first module uses shunting center-surround networks, either in isolation or as part of the Boundary Contour System (BCS) and Feature Contour System (FCS) networks for image conditioning. (BCS/FCS networks are discussed in the following section, “SAR Image Conditioning Using BCS/FCS Networks.”) Figure 19 provides some details on shunting center-surround networks, as applied to SAR imagery for feature extraction. In this application, the shunting center-surround network converts the slant-plane reflectance image into a locally normalized contrast image. Thresholds are then applied to the value at each pixel of the locally normalized contrast image, and an AND operation is used to combine the resulting image with a low-threshold version of the log-reflectance input image to obtain a set of high-contrast feature blobs that can then be projected from the slant plane to the ground plane by using the known radar imaging geometry.


FIGURE 16. Examples of inverse SAR, or ISAR, imagery of three tactical ground vehicles: (a) target 1, (b) target 2, and (c) target 3. The three targets are shown at three different orientations: the left, middle, and right columns of images are for azimuth angles of 0° (front-on view), 45° (intermediate view), and 90° (broadside view), respectively. The images are for a radar depression angle of 15° and vertical-vertical (VV) polarization.


In the locally normalized contrast image, the local contrast is dependent on the choice of spatial scales for the excitatory center and inhibitory surround areas of the receptive field, as shown in Figure 19. These scales are chosen so as to capture the texture of scattering centers on a vehicle as compared to the vehicle as a whole. (Note: The network does not try to detect bright pixels on the target as compared with the surrounding clutter, as is typical of constant false-alarm rate [CFAR] filtering methods.) Another advantage of using shunting networks here is that they perform an automatic gain-control operation, and, as a result, the large dynamic range of radar reflectances collapses into a predefined range in a locally adaptive fashion. These networks are modeled as dynamic membranes [13] and resemble bipolar and ganglion receptive fields in the retina.
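The feature-extraction step can be sketched in a few lines of plain Python. The sketch assumes the equilibrium form of the shunting network, contrast = (C - S) / (λ + S), where C and S are the Gaussian-weighted center and surround sums of the input image and λ is the passive decay rate. The 5 × 5 and 21 × 21 extents follow the Figure 19 caption, but the Gaussian widths, λ, and both threshold values are hypothetical choices made only for illustration.

```python
import math


def gaussian_kernel(size, sigma):
    """Normalized size x size Gaussian kernel (weights sum to 1)."""
    half = size // 2
    k = [[math.exp(-((r - half) ** 2 + (c - half) ** 2) / (2.0 * sigma ** 2))
          for c in range(size)] for r in range(size)]
    total = sum(map(sum, k))
    return [[v / total for v in row] for row in k]


def convolve_same(image, kernel):
    """Zero-padded 2-D filtering that keeps the input image size."""
    rows, cols, half = len(image), len(image[0]), len(kernel) // 2
    out = [[0.0] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            acc = 0.0
            for kr, krow in enumerate(kernel):
                rr = r + kr - half
                if 0 <= rr < rows:
                    for kc, kv in enumerate(krow):
                        cc = c + kc - half
                        if 0 <= cc < cols:
                            acc += kv * image[rr][cc]
            out[r][c] = acc
    return out


def shunting_contrast(image, lam=1.0, center=5, surround=21):
    """Equilibrium contrast (C - S) / (lam + S); kernel widths are assumed."""
    C = convolve_same(image, gaussian_kernel(center, center / 4.0))
    S = convolve_same(image, gaussian_kernel(surround, surround / 4.0))
    return [[(c - s) / (lam + s) for c, s in zip(cr, sr)]
            for cr, sr in zip(C, S)]


def feature_blobs(image, contrast, contrast_thresh=0.5, log_thresh=1.0):
    """AND of thresholded contrast and low-thresholded log reflectance."""
    return [[1 if a > contrast_thresh and math.log10(max(v, 1e-12)) > log_thresh
             else 0 for a, v in zip(arow, vrow)]
            for arow, vrow in zip(contrast, image)]


N = 25
flat = [[7.0] * N for _ in range(N)]                     # featureless patch
point = [[0.0] * N for _ in range(N)]
point[12][12] = 100.0                                    # one bright scatterer
bright = [[v * 1e4 for v in row] for row in point]       # same scene, 10^4 brighter

flat_c = shunting_contrast(flat)
point_c = shunting_contrast(point)
bright_c = shunting_contrast(bright)
blobs = feature_blobs(point, point_c)
print(flat_c[12][12], point_c[12][12], bright_c[12][12])
```

The featureless patch yields essentially zero contrast at its center, the isolated scatterer yields a strong positive response, and multiplying the input by 10^4 changes the output by well under a factor of ten, which is the gain-control behavior noted above.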

Figure  20  illustrates  the  four  steps  involved  in  pro¬ 
cessing  an  ISAR  image  that  contains  four  targets.  The 
input  image  is  shown  in  the  upper  left  quadrant,  and 
the  result  of  feature-blob  extraction  is  shown  in  the 
upper  right.  The  spatial  patterns  of  the  extracted  fea¬ 
ture  blobs  show  strong  resemblance  to  the  scattering 
patterns  obtained  from  SARTOOL  simulations  of 
radar  imagery.  (SARTOOL  decomposes  a  target  ob¬ 
ject  into  its  principal  scatterers  and  then  combines  the 
radar signatures of those scatterers.) Note that we


have  discarded  the  original  reflectance  values  of  the 
feature  blobs  because,  in  practice,  they  can  vary 
considerably  from  one  instance  of  a  target  to  another. 
In  the  more  realistic  case  of  targets  in  clutter,  feature 
blobs generated by nontargets will also be extracted
from  the  clutter.  Thus,  to  simulate  clutter,  we  added 
2%  random  noise  to  the  feature-blob  image  before 
proceeding  with  the  processing.  (In  Figure  20  the 
feature-blob  image  is  shown  without  the  superimpo¬ 
sition  of  any  noise  so  that  we  could  illustrate  clearly 
the  target  feature  blobs  that  emerge  from  the  extrac¬ 
tion  process.) 
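One plausible reading of the clutter simulation is that each pixel of the binary feature-blob image is switched on independently with probability 0.02. The sketch below implements that reading; the interpretation of “2% random noise,” the function name `add_clutter`, and the fixed seed are all assumptions rather than details given in the text.

```python
import random


def add_clutter(blob_image, fraction=0.02, seed=None):
    """Return a copy of a binary image with extra pixels switched on.

    Each pixel is set independently with probability `fraction`;
    pixels that were already on are left unchanged.
    """
    rng = random.Random(seed)
    return [[1 if (pixel or rng.random() < fraction) else 0 for pixel in row]
            for row in blob_image]


# Hypothetical 100 x 100 feature-blob image with one small target blob.
blank = [[0] * 100 for _ in range(100)]
target = [row[:] for row in blank]
for r in range(40, 43):
    for c in range(50, 53):
        target[r][c] = 1

noisy_blank = add_clutter(blank, seed=0)
noisy_target = add_clutter(target, seed=0)
clutter_count = sum(map(sum, noisy_blank))
print(clutter_count, "clutter pixels out of", 100 * 100)
```

Because the noise is combined with the original image by a logical OR, every target feature blob survives and only spurious blobs are added.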

Because  the  image  axes  are  measured  in  units  of 
physical  size,  we  can  use  the  images  directly  to  detect 
potential  targets  and  discriminate  them  from  clutter 
and  nontarget  objects  by  grouping  the  feature  blobs 
into  clusters  of  approximately  the  same  image  size  as 
the  targets  of  interest.  This  grouping  is  performed  in 
the  ground-plane  coordinates,  first  by  using  an  iso- 


[Figure 17(b) block diagram. Input: spotlight sequence. Stage 1, 2-D view processing: feature extraction, target detection, orientation estimation. Stage 2, 2-D view categorization: pattern encoding, aspect learning and recognition. Stage 3, 3-D target hypotheses: transition detection, transition learning, evidence accumulation, target recognition.]

FIGURE  17.  Conceptual  approach  of  ATR  neural  system  for  SAR  image  sequences:  (a)  spotlight  sequence  of  SAR  images 
and  corresponding  aspect  sphere  and  aspect  graph,  and  (b)  functional  block  diagram  of  system.  Note  that  the  approach  is 
analogous  to  the  approach  for  passive  visual  imagery  shown  in  Figure  3. 


FIGURE 18. Modular system architecture for the learning and recognition of 3-D targets from SAR imagery. The three rows of modules represent the three stages of processing shown in Figure 17. Each individual module is a neural network that transforms the imagery as indicated. From a sequence of SAR images the recognized targets generate a dynamic measure of confidence.


[Figure 19 panel labels: input SAR image; normalized contrast image.]


FIGURE 19. Shunting short-term memory model for feature extraction from SAR imagery: (a) center-surround feedforward architecture and (b) center-surround receptive field. The model is implemented as a feedforward dynamical system with an excitatory-center/inhibitory-surround receptive field. In equilibrium the resulting image represents locally normalized contrast. The scales of the receptive field (5 × 5 for the center region and 21 × 21 for the surround region) are chosen to capture the contrast between scatterers and target objects. (Note: A description of the equations that govern shunting short-term memory and the equilibrium condition are given in the box “Shunting Short-Term Memory.” For further details, see Reference 13.)




SHUNTING SHORT-TERM MEMORY

The full-dynamic-range radar image $I$ serves as input to a shunting short-term memory (STM) network (Figure 19), as governed by the dynamics of a charging membrane (essentially Ohm's law). Excitatory input from a Gaussian center $C$, shunted inhibitory input from a Gaussian surround $S$, and passive decay (with rate constant $\lambda_A$) yield an equilibrium contrast measure $A_i$ that is normalized with respect to the local mean amplitude. The Gaussian center is weighted by $G^c_{ij}$, and the Gaussian surround is weighted by $G^s_{ij}$. More general shunting networks are described in Reference 1.

Activity dynamics:
\[ \frac{dA_i}{dt} = -\lambda_A A_i + \sum_j G^c_{ij} I_j - (1 + A_i) \sum_j G^s_{ij} I_j \]

Equilibrium contrast:
\[ A_i = \frac{\sum_j G^c_{ij} I_j - \sum_j G^s_{ij} I_j}{\lambda_A + \sum_j G^s_{ij} I_j} = \frac{C - S}{\lambda_A + S} \]

Reference

1. S. Grossberg, “Nonlinear Neural Networks: Principles, Mechanisms, and Architectures,” Neural Networks 1, 17 (1988).

tropic  receptive  field  (shown  as  circular  areas  in  the 
lower  left  quadrant  of  Figure  20),  and  then  by  using 
oriented  rectangles  in  the  vicinity  of  the  isotropic 
groupings. The rectangles are constructed from inhibitory-center/excitatory-surround receptive fields, motivated by the scatterer distributions that are typical of the targets of interest. Given a view sequence in which targets may be considered stationary as compared to the moving and squinting radar, Adaptive Linear Neurons (ADALINES) [18] performing a recursive least-squares estimation from the measurements can be used to estimate and refine the target locations and orientations. After a target has been detected and localized, it can then be segmented from the scene, rotated into a reference frame aligned with the target (with an ambiguity between whether the target is facing forward or backward), and processed by the remaining modules. The oriented feature blobs for each detected target are shown in the lower right quadrant of Figure 20. Note that sidelobe responses outside the targets have been discarded.
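For a stationary target, the simplest instance of recursive least-squares estimation is a running average: each new measurement corrects the current estimate with a gain of 1/n. The sketch below applies this to a ground-plane location measured once per image of the sequence. It is a generic illustration and not the authors' ADALINE implementation; the measurement values are made up, and orientation is left out because it would additionally need angle wrapping and handling of the front/back ambiguity.

```python
def recursive_location_estimate(measurements):
    """Recursive least-squares estimate of a constant 2-D location.

    With equal weighting of all measurements, the update
        estimate += (measurement - estimate) / n
    reproduces the least-squares solution (the sample mean) after
    every step. Returns the estimate after each measurement.
    """
    estimate = None
    history = []
    for n, z in enumerate(measurements, start=1):
        if estimate is None:
            estimate = list(z)                     # first measurement
        else:
            gain = 1.0 / n                         # shrinking correction gain
            estimate = [e + gain * (zi - e) for e, zi in zip(estimate, z)]
        history.append(tuple(estimate))
    return history


# Hypothetical (x, y) target locations, in pixels, from four successive views.
views = [(10.0, 20.0), (12.0, 18.0), (11.0, 22.0), (9.0, 20.0)]
track = recursive_location_estimate(views)
print(track[-1])
```

Each view refines the estimate without storing the earlier measurements, which is what makes a recursive update attractive for a spotlight sequence.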

Figure 21 illustrates the processing of an individual


target.  The  input  slant-plane  imagery  is  shown  in  the 
upper  left  quadrant  of  the  figure,  and  the  localized 
target  feature  blobs  are  shown  in  the  upper  right. 
After  the  features  are  reoriented  in  a  frame  of  refer¬ 
ence  with  respect  to  the  target,  a  DEB  network  is  used 
to  reduce  the  features  to  points,  as  shown  in  the  lower 
left  quadrant.  This  oriented  spatial  pattern  of  feature 
points  covers  an  extent  of  approximately  20  x  30 
pixels at 1-ft resolution. Moreover, as the target orientation
and radar depression angle change, this pattern
must  change  quickly,  too.  Of  equal  importance  is  the 
deformation  of  this  feature  pattern  with  varying  radar 
squint  angle  (given  an  identical  depression  angle  and 
target  orientation  with  respect  to  the  radar).  For  these 
reasons,  a  template  constructed  from  this  feature  pat¬ 
tern  (no  less  a  template  that  incorporates  the  original 
reflectance  values)  is  both  memory  intensive  and 
fraught  with  difficulties.  Thus  we  can  again  utilize  the 
large  overlapping  receptive  fields  (cf.  Figure  7)  to 
reduce this binary feature pattern to a 9 × 9 array of
analog numbers that code the spatial distribution of
features in a compressed manner that is robust to




FIGURE 20. Multitarget localization and orientation estimation is illustrated for the case of four tactical targets. The upper
left quadrant shows the input slant-plane imagery for the composite of four targets at a depression angle of 15°. (Note: In
each of the four quadrants, the four targets are arranged with target 1 in the upper left, target 2 in the upper right, target 3 in
the lower left, and a modified version of target 1 in the lower right.) The upper right quadrant contains the feature blobs that
have been extracted and projected in the ground plane. Each target is then grouped with a circular mask, and the target's
orientation is estimated with an inhibitory-center excitatory-surround oriented rectangular mask, as shown in the lower left
quadrant. This processing allows the detected targets to be reoriented in a target frame of reference, as has been done in
the lower right quadrant. The color bar at the bottom of the figure denotes increasing (from black to white) reflectivity for the
imagery in the upper left quadrant.


•  WAXMAN  ET  AL. 

Neural Systems for Automatic Target Learning and Recognition


spatial  deformation.  Such  a  coding  is  illustrated  in  the 
lower  right  quadrant  of  Figure  21. 
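
The coarse coding described above can be sketched as follows. This is a minimal illustration, not the authors' implementation: the Gaussian shape of the receptive fields, the `overlap` parameter, and the function names `coarse_code` and `cosine` are assumptions made for the example. It shows why a one-pixel shift that completely decorrelates the binary point pattern barely changes the 9 x 9 analog code.

```python
import numpy as np

def coarse_code(points, grid=9, overlap=1.0):
    """Code a binary feature-point image as a grid x grid analog array.

    Assumption: each receptive field is a Gaussian whose width equals the
    spacing between field centers times `overlap`, so neighbors overlap.
    """
    points = np.asarray(points, dtype=float)
    rows, cols = points.shape
    # Receptive-field centers, evenly spaced over the image.
    cy = np.linspace(0, rows - 1, grid)
    cx = np.linspace(0, cols - 1, grid)
    sy = overlap * (rows - 1) / (grid - 1)
    sx = overlap * (cols - 1) / (grid - 1)
    # Separable weights: wy[i, r] is field-row i's response to pixel row r.
    wy = np.exp(-0.5 * ((np.arange(rows)[None, :] - cy[:, None]) / sy) ** 2)
    wx = np.exp(-0.5 * ((np.arange(cols)[None, :] - cx[:, None]) / sx) ** 2)
    # Activity of field (i, j) is the weighted count of feature points.
    return wy @ points @ wx.T

def cosine(a, b):
    """Cosine similarity between two arrays, flattened."""
    return float(a.ravel() @ b.ravel() / (np.linalg.norm(a) * np.linalg.norm(b)))

# Hypothetical 30 x 20 feature-point pattern and a copy shifted by one pixel.
pattern = np.zeros((30, 20))
for r, c in [(5, 5), (10, 12), (20, 8), (25, 15)]:
    pattern[r, c] = 1.0
shifted = np.roll(pattern, 1, axis=0)

code_a, code_b = coarse_code(pattern), coarse_code(shifted)
print(code_a.shape)
print(cosine(pattern, shifted), cosine(code_a, code_b))
```

The raw binary patterns share no pixels after the shift, whereas the two coarse codes remain nearly parallel, which is the robustness to spatial deformation that the text relies on.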

The 9 x 9 array of 81 numbers forms the input to
an ART-2 network that is dedicated to the learning of
a particular target. The target is learned during a
training session in which the target exposure is
controlled. In a testing session, the system is run in
recognition mode, and the 81-member spatial code
vector is fed to all ART-2 networks representing all
targets of interest. Figure 22 illustrates the results of
training independent ART-2 networks on each of the
three ISAR targets of Figure 16. The resulting aspect
categories that were established are shown color coded
on aspect spheres seen both from the side with the
targets facing left, and from above (compare to Figure


FIGURE  21.  Target  feature  extraction  and  spatial  coding 
for  a  single  target.  The  upper  left  quadrant  shows  the 
input  slant-plane  image  of  a  target  at  a  depression  angle  of 
15°.  In  the  upper  right  quadrant,  feature  blobs  have  been 
extracted  from  the  input  image  and  projected  in  the  ground 
plane.  After  the  feature  blobs  are  reoriented  in  the  target 
frame  of  reference,  a  DEB  network  is  used  to  reduce  the 
blobs  to  points,  as  shown  in  the  lower  left  quadrant.  The 
feature  points  are  then  coarsely  coded  by  a  9  x  9  array  of 
overlapping  receptive  fields  (cf.  Figure  7),  as  shown  in  the 
lower  right  quadrant.  The  color  bar  at  the  bottom  of  the 
figure  denotes  increasing  (from  black  to  white)  reflectivity 
for  the  upper  left  image,  and  increasing  receptive-field 
activity  for  the  lower  right  image. 


11). The data consisted of ISAR images created at all
even azimuths 360° around each target, for radar
depression angles between 15° and 32°, comprising
approximately 3000 views of each target. (Note the
missing data at a few intermediate depression angles
for targets 2 and 3.) The resulting unsupervised
classification generated islands of common category that
extended over large azimuthal extents and many
depression angles. We purposely changed the vigilance
parameter setting between targets to emphasize the
user's control over the fineness of categorization. With
a finer categorization, more details survive the
learning process. The roughly 3000 views of each target
have been compressed into only 34 categories for
targets 1 and 2, and 75 categories for target 3, for
which we used the highest vigilance setting.
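
The effect of the vigilance setting on the number of categories can be illustrated with a short sketch. The code below is a simplified leader-clustering stand-in with a cosine-similarity vigilance test; it is not ART-2 itself, and the function name `categorize`, the learning `rate`, and the synthetic "views" are all assumptions made for the example.

```python
import numpy as np

def categorize(views, vigilance, rate=0.1):
    """Vigilance-gated clustering (a simplified stand-in, NOT ART-2).

    A view joins the best-matching category if its cosine similarity to
    that category's template is at least `vigilance`; otherwise a new
    category is allocated.
    """
    templates = []  # one unit-length prototype per category
    labels = []
    for v in views:
        v = v / np.linalg.norm(v)
        if templates:
            sims = np.array([t @ v for t in templates])
            best = int(np.argmax(sims))
        if templates and sims[best] >= vigilance:
            # Match accepted: nudge the winning template toward the input.
            t = (1 - rate) * templates[best] + rate * v
            templates[best] = t / np.linalg.norm(t)
            labels.append(best)
        else:
            # Mismatch: allocate a new category for this view.
            templates.append(v)
            labels.append(len(templates) - 1)
    return labels, templates

# Hypothetical data: 81-element codes that vary smoothly with azimuth,
# sampled at even azimuths around a full circle.
angles = np.deg2rad(np.arange(0, 360, 2))
phi = np.linspace(0, 2 * np.pi, 81, endpoint=False)
views = np.exp(4.0 * np.cos(angles[:, None] - phi[None, :]))

labels_lo, templates_lo = categorize(views, vigilance=0.90)
labels_hi, templates_hi = categorize(views, vigilance=0.99)
print(len(templates_lo), len(templates_hi))
```

With the higher vigilance, each category tolerates less variation, so the same set of views is split into more categories, mirroring the behavior reported for target 3.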

Associated  with  each  category  allocated  by  each 
ART-2  network  is  a  template  of  the  prototype  9  x  9 
array  (contrast  enhanced  and  normalized)  that  was 
learned  by  the  synapses  (the  adaptive  LTM  sites)  in 
the  network,  as  shown  in  Figure  8.  Eight  of  the  prin¬ 
cipal  templates  for  target  1  are  illustrated  in  Figure 
23,  along  with  sample  slant-plane  images  from  the 
corresponding  viewing  directions.  Note  that  the 
learned  templates  include  two  broadside  views,  fron¬ 
tal  and  end-on  views,  and  the  characteristic  L-shapes 
near  the  four  corner  views.  These  color  patterns  code 
prototype  spatial  feature  patterns  (not  reflectance 
patterns). 

To  complete  the  learned  description  of  each  target, 
the  permitted  transitions  among  aspect  categories  must 
be  detected  and  imposed  on  the  synapses  of  each 
Aspect  network.  The  result  of  this  process  is  con¬ 
tained  in  the  last  row  of  photographs  in  Figure  22. 
The  photographs  show  the  transition  matrices  for 
each  target  (cf.  Figure  13).  In  the  matrices,  a  red  pixel 
corresponds  to  a  permitted  transition  while  a  green 
pixel  codes  the  absence  of  such  a  transition. 
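
A transition matrix of this kind can be sketched as a Boolean array built from the sequence of category labels seen while the target is viewed. This is an illustrative sketch only: the Aspect network learns transitions through its synapses, whereas the code below simply records them, and the function name `learn_transitions` and the choice to treat transitions as reversible are assumptions.

```python
import numpy as np

def learn_transitions(category_sequence, n_categories):
    """Record which category-to-category transitions were observed."""
    allowed = np.zeros((n_categories, n_categories), dtype=bool)
    for a, b in zip(category_sequence[:-1], category_sequence[1:]):
        if a != b:
            allowed[a, b] = True
            # Assumption: a transition seen in one direction is also
            # permitted in the reverse direction.
            allowed[b, a] = True
    return allowed

# Hypothetical category labels for consecutive views of one target.
sequence = [0, 0, 1, 1, 2, 0]
matrix = learn_transitions(sequence, n_categories=3)
print(matrix.astype(int))
```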

Figure 24 contains an example of our ATR system
running in recognition mode. We used an ISAR image
sequence that consisted of 45 views of target 1
(only the odd azimuths in the interval 67° to 157°) at
a depression angle of 21°. (Note: Although this dataset
was not part of the original training set, it was
admittedly not very different from the training data. This
lack of adequate training and test datasets is a problem
with many ATR studies.) The test imagery was
passed through the early modules of our system, and
then to four ART-2 networks (with learning turned
off) coupled to four Aspect networks corresponding
to the training targets 1, 2, and 3, in addition to a
fourth unlearned target (referred to as "other") that
was represented by random synaptic weights. Each
ART-2 network determined the best matching aspect
category for the test target, and each ART-2 network
activated its corresponding Aspect network to
accumulate evidence over the test view sequence. A
competition between the object nodes in the different
Aspect networks then selected the target with the
instantaneous maximum evidence.

FIGURE 22. Aspect spheres and transition matrices for (a) target 1, (b) target 2, and (c) target 3. The spheres were generated
with independent ART-2 networks. For each target, approximately 3000 views, collected for even azimuths 360° around and
depression angles from 15° to 32°, have been compressed into 34, 34, and 75 categories for targets 1, 2, and 3, respectively.
The categories, or aspects, have been assigned colors and are shown on a viewing sphere from the side with the target
facing left (top row of photographs), and from above (center row of photographs). Note the category islands that emerge
over large viewing extents, particularly for target 1. The fineness of categorization is controlled by the vigilance parameter of
the ART-2 network, and can be chosen to be more or less sensitive to variations in the feature patterns. The vigilance ρ is
0.97, 0.98, and 0.99 for targets 1, 2, and 3, respectively. The last row of photographs shows the category transitions that were
learned for each target by independent Aspect networks coupled to each ART-2 network. Each transition matrix codes
possible category transitions in red, while green denotes the absence of such a transition (cf. Figure 13). Note that the
transition patterns are quite different among the targets. Detected transitions between categories contribute to the
evidence accumulation during the recognition process.

In Figure 24, the category sequence recognized by
the ART-2 network for target 1 is illustrated both on
an aspect sphere seen from above, and as a graph of
category versus view number. The second graph in
Figure 24(b) plots the accumulation of evidence for
each target, while the third graph indicates the
selected target with the maximum evidence accumulated
at each view. Although the selected target is
target 1, we can see that target 3 also accumulates a
significant amount of evidence. Thus selecting target
1 solely on the basis of maximum evidence can be
risky because the evidence for target 1 may exceed
that for target 3 by only a slight amount. This
possibility suggests looking at the differential evidence
between the two targets of highest accumulated
evidence, as illustrated in the fourth graph. The
differential evidence may be small for some views, but
it too can be integrated along the temporal view
sequence, giving rise to a dynamic confidence measure.
As shown in the bottom graph of Figure 24, the
confidence measure increases monotonically along the
view sequence in this example. It is a matter of
preference to select the threshold level of confidence that
the system should use in declaring a target as
recognized. Clearly, the number of views required to reach
this confidence threshold will depend on the target
itself, as well as the starting view in a sequence.
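
The chain from accumulated evidence to confidence described above can be sketched in a few lines. The sketch takes the accumulated evidence per view and per target as given (how the Aspect network produces it is not modeled here); the function name `confidence_trace` and the example numbers are hypothetical.

```python
import numpy as np

def confidence_trace(evidence):
    """From accumulated evidence (views x targets), derive the selected
    target, the differential evidence between the two strongest targets,
    and the confidence obtained by integrating that difference."""
    evidence = np.asarray(evidence, dtype=float)
    winners = evidence.argmax(axis=1)
    # Sort each row so the last two columns hold the two largest values.
    top_two = np.sort(evidence, axis=1)[:, -2:]
    differential = top_two[:, 1] - top_two[:, 0]
    confidence = np.cumsum(differential)
    return winners, differential, confidence

# Hypothetical evidence for targets 1, 2, 3, and "other" over three views.
evidence = [[1.0, 0.2, 0.8, 0.1],
            [2.0, 0.3, 1.7, 0.1],
            [3.1, 0.4, 2.5, 0.2]]
winners, differential, confidence = confidence_trace(evidence)
print(winners, differential, confidence)
```

Because the differential evidence is nonnegative by construction, the integrated confidence can only grow as long as the selected target does not change, which matches the monotonic increase seen in the example of Figure 24.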

SAR  Image  Conditioning  Using 
BCS/FCS  Networks 

We  have  already  noted  that  single-channel  SAR  imag¬ 
ery  is  characterized  by  a  very  large  dynamic  range  and 


FIGURE  23.  Aspect  sphere,  example  typical  views,  and  corresponding  learned  templates  for 
target  1  of  Figure  22(a).  The  learned  templates  include  two  broadside  views,  frontal  and 
end-on  views,  and  the  characteristic  L-shapes  near  the  four  corner  views.  Note  the  ability  of 
the  ART-2  network  to  quantize  the  viewing  space  around  a  target  in  an  unsupervised 
fashion.  The  learned  templates  are  then  used  for  the  recognition  process. 




excessive speckle noise, and man-made objects possess
rather broken signatures that vary rapidly with small
changes in viewing angle. To a great extent, we can
alleviate these problems by first conditioning the
imagery with the Boundary Contour System and Feature
Contour System (BCS/FCS) network paradigm
developed by S. Grossberg, E. Mingolla, and D. Todorovic
(see chapters 1 to 4 in Reference 19). This neural
processing architecture is strongly motivated by the
known anatomy and physiology of the early visual
processing stages, including that of the retina, LGN,
V1, V2, and V4. The architecture, which essentially
incorporates a general theory of preattentive vision,
has also been quite successful in explaining a very
large body of psychophysical perceptual data.

The BCS/FCS networks are shown as an alternative
first module in the ATR system of Figure
18. Our preliminary work indicates that the initial
processing of SAR imagery with BCS/FCS networks,
in lieu of a shunting center-surround network,
improves the target detection (and false-alarm rejection)
process.

Preattentive  vision,  in  its  simplest  form,  is  a  com¬ 
putational  process  in  which  contours  are  contextually 
established  and  the  perceived  brightness  (and  color)  is 
generated  primarily  from  local-contrast  information. 
In  BCS/FCS  theory,  the  role  of  the  BCS  network  is  to 
establish  such  contours  in  the  context  of  local  fields  of 


[Figure 24(b) graphs, top to bottom: category, evidence, winning object (target 3, target 2, target 1, other), differential evidence, and confidence.]

FIGURE  24.  Example  of  target  recog¬ 
nition  by  evidence  accumulation:  (a) 
aspect  sphere  and  (b)  recognition  re¬ 
sults.  The  ISAR  image  sequence 
used  consists  of  45  views  of  target  1 
(only  the  odd  azimuths  in  the  inter¬ 
val  67°  to  157°)  at  a  depression  angle 
of  21°.  The  category  sequence  rec¬ 
ognized  by  the  ART-2  network  for 
target  1  is  represented  both  on  an 
aspect  sphere  seen  from  above  in 
part  a,  and  as  a  plot  of  category  ver¬ 
sus  view  number,  as  shown  in  the 
first  graph  of  part  b.  The  next  graph 
shows  the  evidence  generated  by  the 
resulting  category  matches  for  the 
training  targets  1, 2,  and  3.  The  third  graph  indicates  the  "winning  object,"  i.e.,  the  selected  target  with  the  maximum 
evidence  accumulated  at  each  view.  The  target  with  the  instantaneous  maximum  evidence  is  consistently  target  1, 
although  target  3  also  has  a  strong  response.  The  differential  evidence  between  those  two  targets  is  plotted  in  the 
fourth  graph,  and  integrated  across  view  numbers  in  the  fifth  graph  to  generate  a  monotonically  increasing  confidence 
measure. 




[Figure 25 diagram. Part (a), BCS: cooperative-competitive (CC) loop. Part (b), FCS: boundary contours from the BCS and contrast signals from the center-surround field feed the FCS diffusion layer. Legend: center-surround receptive field (discount illuminant); oriented-contrast detector (edge detection); oriented-contrast detector (contrast rectification); spatial competition (spatial sharpening); orientational competition (orientational sharpening); on-off antagonism (spatial impenetrability); bipole receptive field (long-range grouping and completion).]


edge fragments, or oriented contrast. Although such
boundary contours are themselves invisible, they
modulate the dynamics of a diffusive filling-in process in
the FCS network whereby local contrast and
brightness information mix and spread within such
boundaries to create a smoothly shaded figure.

The  architecture  of  the  BCS  network  is  illustrated 
in  Figure  25(a).  Beginning  with  monocular  prepro¬ 
cessing  in  the  form  of  shunting  center-surround  re¬ 
ceptive  fields,  local  measures  of  normalized  isotropic 
contrast  are  made.  An  oriented-contrast  filter  then 
derives  evidence  for  local  edge  fragments,  which  are 
then  used  as  input  to  a  cooperative-competitive  (CC) 
feedback  loop.  The  CC  loop  performs  long-range 
completion  of  contours  in  the  context  of  the  local 
edge  statistics.  We  have  found  that  one  pass  through 
the  CC  loop  is  typically  sufficient  for  our  purposes. 
The boundary contours obtained from the BCS network
provide input to the FCS filling-in network,
along  with  the  center-surround  contrast  signals,  as 
shown  in  Figure  25(b).  Essentially,  the  contrast  sig¬ 
nals  try  to  spread  diffusively  to  neighboring  nodes, 
but  the  BCS  signals  modulate  the  local  diffusivity 


FIGURE 25. Architecture of the (a) Boundary Contour System
(BCS) of S. Grossberg and E. Mingolla and the (b)
Feature Contour System (FCS) of Grossberg and D.
Todorovic. The BCS architecture in part a models the
neurodynamics of preattentive visual processing in the
LGN, V1, and V2 visual areas of the brain. Shunting
center-surround receptive fields provide input to
oriented-contrast, or edge, detectors that compete across position
and orientation. The resulting local edge fragments are
grouped over large distances by oriented bipole receptive
fields that feed back to the oriented-contrast detectors to
complete broken boundaries comprising the edge
fragments. The boundary contours obtained from the BCS
network provide input to the FCS network, which uses the
information to modulate the local diffusivity between
compartments, as shown in part b. The local diffusivity
between compartments affects the FCS diffusion layer, where
the contrast signals from the shunting center-surround
network spread laterally in two dimensions. In the
diffusion layer, strong boundary contours inhibit diffusion
across the boundaries. The FCS architecture models
hypothesized filling-in interactions in the V4 visual area of
the brain. (Adapted from Figures 15 and 17 of chapter 1 in
Reference 19, with permission. This reference also
contains a detailed description of BCS/FCS networks.)




such that strong boundary contours inhibit diffusion
across the boundaries. Thus the boundary contours
impede the spreading contrast signals. In the presence
of a dense web of boundary contours of varying
strength, the FCS diffusion process results in smoothly
shaded images while retaining sharp transitions in
brightness.
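
The boundary-gated filling-in just described can be illustrated with a one-dimensional sketch. This is a deliberately simplified model, not the FCS equations of Reference 19: it uses pure nearest-neighbor diffusion in which the permeability of each link falls as the boundary strength on that link rises. The function name `fill_in` and the parameters `steps`, `rate`, and `gain` are assumptions chosen for the example.

```python
import numpy as np

def fill_in(contrast, boundary, steps=200, rate=0.25, gain=1000.0):
    """Diffuse contrast signals along a 1-D chain of nodes.

    contrast : signal at each of n nodes.
    boundary : boundary strength on each of the n - 1 links between
               neighboring nodes (0 means no boundary).
    """
    s = np.asarray(contrast, dtype=float).copy()
    # Assumed gating law: strong boundaries make a link nearly impermeable.
    perm = 1.0 / (1.0 + gain * np.asarray(boundary, dtype=float))
    for _ in range(steps):
        flow = perm * (s[1:] - s[:-1])  # signal moving from node i+1 to node i
        s[:-1] += rate * flow
        s[1:] -= rate * flow
    return s

# Hypothetical noisy signal: a dim region next to a bright region,
# separated by one strong boundary on the link between nodes 5 and 6.
contrast = np.array([1.2, 0.8, 1.1, 0.9, 1.0, 1.0,
                     3.1, 2.9, 3.0, 3.2, 2.8, 3.0])
boundary = np.zeros(11)
boundary[5] = 1.0

filled = fill_in(contrast, boundary)
print(np.round(filled, 2))
```

Within each region the fluctuations are smoothed away, while the step between the two regions survives because the boundary blocks most of the flow across it.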

In applying BCS/FCS processing to SAR imagery,
various parameters in the governing dynamical
system need to be selected so that the pixel values are not
discounted completely (because the original SAR
image brightnesses are actually reflectance measures).
We can then preserve the ordering of the resulting
brightnesses in fairly uniform areas so as to mimic the
ordering of the initial reflectance values. In nonuniform
areas, however, the resulting signals indicate a
mixture of reflectance and local contrast. The overall
effect is SAR imagery with significantly less speckle
noise, darkened and sharpened shadows, and more
smoothly shaded signatures. Figure 26(a), obtained
with the Lincoln Laboratory ADTS SAR, illustrates a
clutter scene of trees, roads, and an overpass that
crosses the New York State Thruway. The image was
obtained with single-channel VV polarization at 1-ft
resolution. Because of the large dynamic range, the
scene is displayed as a log-amplitude image.
Figure 26(b) shows the same scene after BCS/FCS
processing of the full-dynamic-range SAR image.
Note the dramatic reduction in speckle, the
darkening of shadows, the sharpening of shadow contours,
and the smooth shading of the tree tops, roads, and
grass.

Figure 27 illustrates the various stages of
processing for the three ISAR targets oriented at a 45°
azimuth and a 15° radar depression angle. The
log-amplitude ISAR imagery is shown in the first column,
the contrast-enhanced output of the shunting
center-surround network is contained in the second column,
the boundary contours derived from the BCS
network are given in the third column, and the smoothly
shaded signatures obtained from the FCS network are
in the fourth column. An important attribute of the
FCS filled-in signatures is that they are quite stable
with respect to small changes in target orientation.

For ATR applications involving SAR imagery, BCS/FCS
processing is a useful image-conditioning procedure.
We expect it to improve both the target detection
and recognition stages of our ATR system.

FIGURE 26. SAR image conditioning with BCS/FCS networks:
(a) original SAR image of an overpass that crosses
the New York State Thruway and (b) image after BCS/FCS
processing. The original single-channel VV-polarization
image (shown as log amplitude of the reflectance) is
corrupted by speckle noise, which results in many false alarms,
making target detection difficult. In the BCS/FCS-processed
image, note the reduction of speckle noise, the
darkening of shadow areas, and the crispness of the
shadow contours. Such image conditioning improves target
detection while reducing false alarms. The SAR image
was obtained with the Advanced Detection Technology
Sensor (ADTS), a Lincoln Laboratory millimeter-wave
radar.

Reentry-Vehicle Recognition
from  ISAR  Sequences 

We  have  also  applied  our  SAR  target-recognition  sys¬ 
tem  to  the  identification  of  reentry  vehicles  imaged 
by  a  ground  radar  while  the  vehicles  were  spinning 
and  traveling  along  a  trajectory.  The  resulting  reentry- 
vehicle  images  are  thus  ISAR  imagery,  although  they 
are  in  general  simpler  than  that  obtained  with  tactical 
targets  in  clutter.  Radar  processing  is  typically  done  in 
the  range-Doppler  domain  to  extract  peaks  corre¬ 
sponding  to  isolated  scattering  centers  on  the  vehicle’s 
shroud.  We  can  then  apply  to  these  data  the  same 
three  stages  of  processing  that  we  applied  to  the  ISAR 
tactical-target  imagery:  the  range-Doppler  peaks  can 
be  used  as  point  feature  patterns,  the  patterns  can  be 
encoded  by  overlapping  receptive  fields  followed  by 
classification  with  ART-2,  and  evidence  and  confi¬ 
dence  can  be  accumulated  with  the  Aspect  network. 
(Note:  An  alternative  approach  to  the  learning  and 
recognition  of  reentry  vehicles  has  recently  been  re¬ 
ported  by  A.M.  Aull  et  al.  [20].) 

With  this  approach  in  mind,  we  have  constructed 
ISAR  imagery  of  point  scatterers  for  three  reentry 
vehicles  over  several  rotations  at  a  single  angle  of 
attack  (i.e.,  a  single  depression  angle).  The  vehicles 
are designated as RV-1, RV-2, and RV-3. Figure 28
illustrates the result of coding and ART-2 category
learning for vehicle RV-2 at a vigilance setting of
0.95. Aspect categorizations over multiple rotations
are shown on an aspect sphere, along with the learned
templates  and  typical  feature  patterns  for  the  six  cat¬ 
egories  that  ART-2  established.  Figure  29  shows  the 
results  of  separate  ART-2  categorizations  for  all  three 
reentry  vehicles  over  multiple  rotations,  as  well  as  the 
learned  transitions  among  aspect  categories  used  by 
the  Aspect  network.  The  three  vehicles  differ  in  their 
complexity,  which  is  reflected  by  the  number  of  cat¬ 
egories  required  by  ART-2  to  cluster  the  data:  3,  6, 
and  16  categories  for  RV-1,  RV-2,  and  RV-3, 
respectively. 

The results of a recognition experiment are shown
in Figure 30 (compare to Figure 24). In the
experiment, a sequence of views of vehicle RV-3 was input
to the system. (Note: This sequence was not part of
the data used for training the system.) Again,
evidence was accumulated for all three targets in
addition to an unlearned target, the target of maximum
evidence was chosen, differential evidence was
computed from the two targets of highest evidence, and
the difference was integrated along the view sequence
to generate a confidence measure. In Figure 30 we see
an example of the system changing its selection. The
system first (correctly) chooses RV-3 although the
confidence is still relatively low, then the system gets
confused and switches between the other two
vehicles. The switching resets the confidence to zero,
and it remains very small due to the small differential
evidence generated. Finally, the system locks back
onto the correct decision, and confidence builds
monotonically.
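
The reset behavior in this experiment can be sketched as follows. The code is illustrative only: the function name `confidence_with_reset`, the rule that the current view's differential evidence is counted immediately after a reset, and the example numbers are assumptions rather than details given in the text.

```python
def confidence_with_reset(winners, differential, threshold):
    """Integrate differential evidence along the view sequence, resetting
    the total to zero whenever the selected target changes.

    Returns the confidence after each view and the index of the first
    view at which the confidence reaches `threshold` (None if never).
    """
    confidence = []
    total = 0.0
    declared_at = None
    previous = None
    for view, (winner, diff) in enumerate(zip(winners, differential)):
        if previous is not None and winner != previous:
            total = 0.0  # the selection switched, so start over
        total += diff
        confidence.append(total)
        if declared_at is None and total >= threshold:
            declared_at = view
        previous = winner
    return confidence, declared_at

# Hypothetical sequence: correct choice (index 2), confusion, then lock-on.
winners = [2, 2, 0, 1, 2, 2, 2]
differential = [0.5, 0.5, 0.1, 0.1, 1.0, 1.0, 1.0]
confidence, declared_at = confidence_with_reset(winners, differential, 2.5)
print(confidence, declared_at)
```

In this example the confidence built during the first two views is lost at the first switch, and the threshold is reached only after the selection has remained stable for several consecutive views.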

Table  1  (page  109)  summarizes  the  results  of  pre¬ 
liminary  recognition  experiments  on  these  three  reen¬ 
try  vehicles.  In  each  case  the  test  sequence  consisted 
of  90  images  starting  at  randomly  selected  azimuths. 
In  all  cases  the  correct  vehicle  was  recognized,  and 
fewer  than  25  images  were  required  in  each  sequence 
to  converge  to  a  high-confidence  correct  decision.  By 
converting  this  result  to  the  fraction  of  each  vehicle's 
rotation  cycle  that  is  required  to  achieve  such  recogni¬ 
tion,  we  find  that  fewer  than  two  revolutions  were 
required  in  each  case. 

Learning  and  Recognition  Using 
Salient  Object  Parts 

The  3-D  object  learning  and  recognition  system  de¬ 
scribed  thus  far  processes  the  views  of  objects  as  a 
whole.  But  this  approach  can  lead  to  a  decline  in 
recognition  ability  when  an  object  is  partially  oc¬ 
cluded  or  disguised,  or  when  a  part  of  the  object  is 
articulated  or  variable  (removed  or  replaced).  To 
deal with these situations, we return to biology for
guidance. 

The  brain  processes  information  by  using  a  prin¬ 
ciple  of  contrast.  Many  operations  seem  to  be  cast  in 
terms  of  differences,  or  in  terms  of  the  detection  of 
novelties  or  transitions  in  space,  time,  or  patterns. 
Mechanisms exist that detect novel changes, as reflected
by the peak in EEG measurements that occurs
300  msec  after  the  introduction  of  an  unexpected 




FIGURE 27. BCS/FCS processing applied to the ISAR images of (a) target 1, (b) target 2, and (c) target 3. From left to right,
the columns show different stages of the BCS/FCS processing. The ISAR imagery (first column) is contrast enhanced by a
shunting center-surround network (second column) and boundary contours are extracted (third column). The
contrast-enhanced imagery diffuses within the boundary contours to produce filled-in target signatures (fourth column). All three
targets are oriented at a 45° azimuth and a 15° radar depression angle.


FIGURE  28.  Learned  categories  for  reentry  vehicle  RV-2  are  plotted  on  an  aspect  sphere  over  four  rotations  of  the  target. 
(For  convenience,  the  data  for  each  of  the  four  rotations  have  been  plotted  on  the  aspect  sphere  at  different  shifted 
depression  angles.  Note  the  four  rounds  of  colored  dots  on  the  sphere.)  During  the  learning  process,  ART-2  generated 
only six categories (at a vigilance setting of 0.95). The learned templates along with the representative scatterer patterns for
the  six  categories  are  shown.  From  the  upper  right  corner  of  the  overall  figure,  the  corresponding  colors  for  the  learned 
templates  are  dark  brown,  dark  blue,  green,  light  blue,  white,  and  light  brown. 



FIGURE 29. Learned categories (aspect spheres) and transition matrices for the (a) RV-1, (b) RV-2, and (c)
RV-3 reentry vehicles over multiple rotation cycles. The vigilance setting is 0.95, 0.95, and 0.96 for the three
reentry vehicles, respectively, and the resulting number of categories established is 3, 6, and 16, respectively.
(The difference in the number of categories reflects differences in the complexities of the three vehicles.) In
the transition matrices, the possible category transitions are coded in blue, while red denotes the absence of
such a transition.


new stimulus. Other mechanisms are responsible for
suppressing information that is not changing. For
instance, stabilized retinal images fade away in about
one second. In fact, all sensory systems become
habituated to constant or repetitive input patterns; indeed,
human vigilance decreases after long periods of
waiting, and we become bored. This principle of
contrast has been exploited earlier in our system in the
form of center-surround receptive fields, edge
detectors, competitive learning, view-transition
detection, and confidence estimation via differential
evidence. We now use the principle again, this time as
a foundation for [illegible], hierarchical object-part
representations, and caricature-based recognition
[reference number illegible].

Visual attention not only focuses processing power
on an object in a scene, it often isolates only a part
of the object for closer inspection. (Note: Evidence
for such a finely tuned attentional mechanism
has been found in psychological studies in
which subjects are demonstrably unaware of stimuli
external to the attended visual area.) A serial
examination of the object takes place in which the
examination is focused on the different [illegible] of the
stimuli, which may or may not correspond to
different parts of the object. But what actually constitutes
an object part?

For a specific recognition task, some parts may
carry more information than other parts, and
determining those key parts and the amount of
information they carry depends on the specific task. For
example, human faces typically have two eyes, so that
particular piece of information is not very useful in
discriminating between different people, although it
would be useful in differentiating human faces from
clock faces. In a tactical military application, we need


FIGURE  30.  Example  of  recognition 
by  evidence  accumulation:  (a)  as¬ 
pect  sphere  and  (b)  recognition  re¬ 
sults  (cf.  Figure  24).  The  category 
sequence  recognized  by  the  ART-2 
network  for  reentry  vehicle  RV-3  of 
Figure  29(c)  is  represented  both  on 
an aspect sphere seen from above
in  part  a,  and  as  a  plot  of  category 
versus  view  number,  as  shown  in 
the  first  graph  of  part  b.  The  next 
graph  shows  the  evidence  generated 
for  the  three  different  reentry  ve¬ 
hicles,  the  following  graph  shows 
the  selected  target  with  the  maxi¬ 
mum  evidence,  and  the  last  two 
graphs  show  the  differential  evi¬ 
dence  between  the  two  highest  scor¬ 
ing  targets  and  this  differential  evi¬ 
dence  integrated  along  the  view  sequence  to  give  a  measure  of  confidence.  In  this  case,  the  system  initially 
identifies  the  target  correctly  and  confidence  grows,  although  the  differential  evidence  remains  small.  But  the 
system  then  changes  its  decision,  causing  the  confidence  to  be  reset  to  zero.  Finally,  the  system  reverts  to  the 
correct  identification  and  locks  in  on  that  decision,  and  confidence  grows  monotonically. 
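
The confidence measure described in this caption can be illustrated with a minimal Python sketch. It assumes that per-view evidence values for each candidate target are already available. The function name `track_confidence`, the input layout, and the numbers in the usage example are hypothetical and are not taken from the original system. Leaving the confidence at zero on the view where the decision changes is also an assumption.

```python
def track_confidence(evidence_per_view):
    """Return (choices, confidences) for a sequence of per-view evidence.

    evidence_per_view: list of dicts mapping a target name to its evidence
    for that view (hypothetical format; at least two targets per view).
    """
    choices, confidences = [], []
    previous_choice = None
    confidence = 0.0
    for evidence in evidence_per_view:
        # Rank the targets by evidence, highest first.
        ranked = sorted(evidence.items(), key=lambda item: item[1], reverse=True)
        (choice, best), (_, second) = ranked[0], ranked[1]
        # Differential evidence between the two highest-scoring targets.
        differential = best - second
        if choice == previous_choice:
            # Same decision as before: integrate the differential evidence.
            confidence += differential
        else:
            # Decision changed (or first view): confidence is reset to zero.
            confidence = 0.0
        choices.append(choice)
        confidences.append(confidence)
        previous_choice = choice
    return choices, confidences


# Made-up evidence values for three reentry vehicles over five views.
views = [
    {"RV-1": 1.0, "RV-2": 0.5, "RV-3": 1.5},
    {"RV-1": 1.0, "RV-2": 0.5, "RV-3": 1.25},
    {"RV-1": 2.0, "RV-2": 0.5, "RV-3": 1.5},
    {"RV-1": 2.0, "RV-2": 0.5, "RV-3": 3.0},
    {"RV-1": 2.0, "RV-2": 0.5, "RV-3": 4.0},
]
choices, confidences = track_confidence(views)
```

In this made-up sequence the decision briefly switches away from RV-3, so the confidence drops to zero and only starts to grow again once the decision stays fixed.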


Table 1. Reentry-Vehicle Recognition Results

Test      Number of Images   Correct       Number of Images   Fraction of Rotation
Vehicle   in Test Sequence   Recognition   Required for       Cycle Required for
                                           Convergence        Convergence

RV-1      90                 Yes           4                  0.15
RV-2      90                 Yes           9                  0.33
RV-3      90                 Yes           23                 1.58



to determine what information is useful in recognizing the
differences between the various types of vehicles that are being
sought.

In  our  research,  we  use  differencing  to  generate 
expectation-driven  part  segmentation  cues.  As  with 
the  3-D  object  learning  and  recognition  system  de¬ 
scribed  earlier,  in  Figure  31  the  best-match  aspect 
category for tank 1 can be located on the tank-1
aspect  sphere.  The  category  carries  with  it  a  learned 
template  of  the  invariant  appearance  of  the  object. 
For  the  extension  to  part-based  representations,  the 
category  must  also  carry  a  more  complete  description 
to  include  characteristic  attributes  of  the  object  such 
as  scale,  orientation,  context,  and  other  information. 
(Recall  the  what  and  where  visual  pathways,  and  their 
interaction,  mentioned  earlier.)  In  addition  to  a  de¬ 
scription  of  specific  views  of  specific  objects,  the  sys¬ 
tem  also  requires  a  description  of  the  views  of  generic 
(i.e.,  average)  objects  of  a  class.  The  generic-object 
description  is  necessary  to  represent  efficiently  the 
hierarchical  descriptions  that  have  been  learned,  as 
well  as  to  navigate  quickly  through  the  representation 
during  the  recognition  phase,  as  described  below.  A 
generic-object  description  subsumes  the  descriptions 
of  all  the  specific  objects  that  are  associated  with  it. 
For  instance,  a  generic  cannon-tank  side  view  is  the 
average  of  the  side  views  of  all  tanks  that  have  can¬ 
nons.  Thus  the  generic  view  is  a  generalized  compos¬ 
ite  representation. 
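
The idea of a generic view as an average can be sketched in a few lines of Python. This is only an illustration under simplifying assumptions: views are small 2-D grids of numbers that are already aligned and equally sized, and the name `generic_view` and the pixel values are hypothetical.

```python
def generic_view(views):
    """Average equally sized 2-D views (lists of rows) element by element."""
    count = len(views)
    rows, cols = len(views[0]), len(views[0][0])
    return [
        [sum(view[r][c] for view in views) / count for c in range(cols)]
        for r in range(rows)
    ]


# Two made-up 2x2 "side views" standing in for learned templates.
m48_side = [[0, 4], [8, 2]]
m60_side = [[2, 4], [4, 6]]
generic_cannon_tank = generic_view([m48_side, m60_side])
```

Each cell of `generic_cannon_tank` is the mean of the corresponding cells in the specific views, which is one simple reading of "generalized composite representation."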

After  an  ART-2  category  is  activated,  the  next  step 
in  the  object-part  process  is  to  compare  the  descrip¬ 
tion  associated  with  the  activated  category  node  of 
tank 1 with the corresponding previously learned description
for the generic tank. The differences between
the two descriptions are reported in the form of
a  visual  map  called  the  Saliency  Map.  If  all  tanks  have 
exactly  the  same  treads,  turret,  and  cannon,  then 
these  parts  are  not  salient  to  the  recognition  or  dis¬ 
crimination  tasks,  and  they  will  not  appear  in  the 
Saliency  Map.  On  the  other  hand,  if  the  gun  is  longer 
for  tank  1  than  for  other  tanks,  then  this  difference 
will  be  evident  in  the  Saliency  Map,  and  the  degree  to 
which  it  is  highlighted  is  used  to  prioritize  the  serial 
attentional  examination  strategy. 

Of  course,  an  input  image  can  activate  (to  various 
degrees)  the  category  nodes  in  many  different  tank 


ART-2 networks. Each of these category nodes has a
corresponding  view-description  template  with  other 
associated  information,  including  its  own  Saliency 
Map. Each tank’s Saliency Map indicates which parts
are  most  salient  to  discriminating  that  particular  tank, 
and  the  saliencies  predict  which  parts  should  be  in  the 
image  from  that  vantage  point,  if  the  object  in  ques¬ 
tion  is  indeed  that  particular  tank.  The  predictions 
become  expectation-driven  attentional  cues  for  seg¬ 
menting  the  most  salient  parts  of  the  image,  as  shown 
in  Figure  32.  With  Saliency  Maps,  we  not  only  know 
what parts to look for, but we also know where to look
for  those  parts  relative  to  other  parts  and  to  the  object 
as  a  whole.  As  each  expectation  is  investigated,  it 
either  confirms  or  contradicts  the  hypothesized  de¬ 
scription,  and  evidence  is  accumulated  or  dissipated 
for  each  potential  model  target. 
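
The accumulation and dissipation of evidence as expectations are checked can be sketched as follows. This is a hypothetical illustration, not the authors' implementation: the unit increments, the part and location labels, and the function name `update_evidence` are all assumptions.

```python
def update_evidence(evidence, expectations, observed_parts):
    """Raise or lower each target's evidence as its expectations are checked.

    evidence: dict mapping target name to current evidence score.
    expectations: dict mapping target name to a list of (part, location)
        pairs that should be visible if the object is that target.
    observed_parts: set of (part, location) pairs found in the image.
    """
    updated = dict(evidence)
    for target, expected in expectations.items():
        for expectation in expected:
            if expectation in observed_parts:
                updated[target] = updated.get(target, 0) + 1  # confirmed
            else:
                updated[target] = updated.get(target, 0) - 1  # contradicted
    return updated


# Made-up expectations for two candidate tanks.
expectations = {
    "tank 1": [("long gun", "front"), ("turret", "top")],
    "tank 2": [("short gun", "front"), ("turret", "top")],
}
observed = {("long gun", "front"), ("turret", "top")}
scores = update_evidence({"tank 1": 0, "tank 2": 0}, expectations, observed)
```

Here both of tank 1's expectations are confirmed, while tank 2 gains evidence for the turret and loses it for the missing short gun.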

A  Saliency  Map  can  be  obtained  for  a  particular 
object  by  computing  the  difference  between  the  de¬ 
scription  of  the  characteristic  view  of  that  object  and 
the  corresponding  description  of  the  characteristic 
view  of  the  class  of  objects  to  which  that  particular 
object belongs, i.e., the generic object. Figure 31 illustrates
this process. (Note: For simplicity, the Saliency
Map  shown  in  Figure  31  was  derived  from  the  origi¬ 
nal  gray-scale  imagery.  In  a  complete  implementa¬ 
tion,  however,  the  Map  should  be  obtained  from  an 
invariant  description  of  a  view,  such  as  a  log-polar 
mapped  image  with  the  illuminant  discounted  in  the 
case  of  passive  visible  sensors.)  As  an  example,  we 
might  have  a  generic  class  of  objects  that  are  tanks 
with  cannons,  turrets,  and  treads,  and  included  within 
that  class  we  might  have  M48  and  M60  tanks.  Then 
the  Saliency  Map  for  the  front  view  of  an  M60  tank 
represents  the  differences  between  the  front  view  of 
the  M60  and  the  front  view  of  a  similar  class  of  tanks 
in  general.  Such  differences  are  referred  to  as  “activ¬ 
ity” — the  greater  the  difference  in  a  particular  area  of 
the  Saliency  Map,  then  the  greater  the  activity  in  that 
part.  Areas  in  which  there  are  no  differences  (i.e.,  no 
activity) are ignored in the scheduling of attentional
shifts. 
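
A minimal Python sketch of the difference computation and the resulting attention schedule is given below. It assumes views are small, aligned grids of numbers and uses the absolute difference as the "activity"; the function names and sample values are hypothetical. As the note above points out, a complete implementation would operate on an invariant description rather than raw gray levels.

```python
def saliency_map(specific, generic):
    """Activity at each location: absolute difference from the generic view."""
    return [
        [abs(s - g) for s, g in zip(s_row, g_row)]
        for s_row, g_row in zip(specific, generic)
    ]


def attention_order(saliency):
    """List (row, col) locations by decreasing activity; zero activity is ignored."""
    active = [
        (activity, r, c)
        for r, row in enumerate(saliency)
        for c, activity in enumerate(row)
        if activity > 0
    ]
    active.sort(key=lambda cell: cell[0], reverse=True)
    return [(r, c) for _, r, c in active]


# Made-up front views of a generic tank and an M60.
generic_front = [[5, 5, 5], [5, 5, 5]]
m60_front = [[5, 9, 5], [2, 5, 5]]
activity = saliency_map(m60_front, generic_front)
order = attention_order(activity)
```

Locations with the largest activity come first in `order`, and locations with no difference never enter the schedule.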

Using Saliency Maps, we can organize a hierarchical
representation of the learned objects. Figure 33
illustrates  an  example  hierarchy  of  tanks.  Beginning 
at  the  upper  left  are  the  descriptions  of  a  generic  tank 


THE LINCOLN LABORATORY JOURNAL, VOLUME 6, NUMBER 1, 1993




Generic  tank 


FIGURE 31. Construction of a Saliency Map and corresponding caricature image for the side view of tank 1, an M60 tank. The
Saliency  Map  is  created  by  taking  differences  between  the  side  view  of  tank  1  and  the  side  view  of  a  similar  class  of  tanks  in 
general.  (This  class  of  tanks  is  collectively  called  a  generic  tank).  The  caricature  image  emphasizes  the  salient  parts  of  tank 
1  with  respect  to  the  generic  tank.  The  salient  parts  in  a  Saliency  Map  are  used  to  generate  attentional  cues  for  the 
recognition  and  discrimination  of  a  particular  object  among  similar  objects,  and  the  use  of  caricatures  increases  the 
efficiency  of  this  process. 


from various aspects. If all we desire is to discriminate between
tanks and aircraft, then this level of description may be adequate.
If, instead, we desire to discriminate between a flamethrower tank and
a cannon tank, then more detailed descriptions indicating the
information-carrying attributes of both types of tanks are needed. The
Saliency Maps described earlier naturally contain this information, so
that if an object has been determined to be a tank, the Saliency Maps
indicate exactly what must be investigated to make a more refined
decision about which specific tank the object is. Once the tank has
been recognized as a cannon tank, either an M48 or an M60 in this
example, additional Saliency Maps indicate the differences between
these two types of tanks and the generic cannon tank. Although Figure
33 shows only 2-way branching, the branching often is N-way.
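
The descent from generic object to specific target can be sketched as a search over a small tree. This is a simplified, hypothetical illustration: each node stores a template as a flat list of numbers, and the child with the smallest summed absolute difference from the input is followed. The article instead uses Saliency Maps to decide what to examine at each level; the names `mismatch` and `descend` and all values are assumptions.

```python
def mismatch(view, template):
    """Sum of absolute differences between an input view and a template."""
    return sum(abs(v - t) for v, t in zip(view, template))


def descend(node, view):
    """Follow the best-matching child at each level; return the names visited."""
    path = [node["name"]]
    while node.get("children"):
        # Any number of children is allowed, so the branching can be N-way.
        node = min(node["children"], key=lambda child: mismatch(view, child["template"]))
        path.append(node["name"])
    return path


# Made-up hierarchy loosely following the tank example in the text.
hierarchy = {
    "name": "generic tank",
    "template": [5, 5, 5],
    "children": [
        {"name": "flamethrower tank", "template": [5, 2, 5], "children": []},
        {
            "name": "cannon tank",
            "template": [5, 8, 5],
            "children": [
                {"name": "M48", "template": [4, 8, 5], "children": []},
                {"name": "M60", "template": [7, 8, 5], "children": []},
            ],
        },
    ],
}
path = descend(hierarchy, [7, 9, 5])
```

Only the templates along the chosen branch are compared, which is the sense in which a hierarchy can guide a rapid search among learned categories.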

Caricatures of the object descriptions can be used to increase the
efficiency of the recognition process. For the recognition of human
faces, there are many different possible facial caricatures that can
be used, depending on what qualities are emphasized. A caricaturist
might emphasize age, sex, beauty, or simply the differences evidenced
between a particular face and a corresponding generic age-matched,
sex-matched face. P.J. Benson and D.I. Perrett [8] have demonstrated a
reduction in reaction time in the recognition task for subjects who
are shown a caricatured face versus a non-caricatured face.


Object-part decomposition

2-D view processing    2-D view classification

FIGURE 32. Conceptual approach to the learning and recognition of class-object-part hierarchies. Again, views are
quantized into aspects through the use of unsupervised learning, but objects of a class are averaged together to form
generic-object representations. Differences between specific objects and the generic object of that class are highlighted
on a Saliency Map (Figure 31), which is then used to focus attention on salient parts during the recognition process.
Recognized aspect categories for salient parts generate evidence for targets. In addition, the categories prime the system
with expectations for other parts at certain locations.


Caricaturing  occurs  naturally  in  the  class-object- 
part  hierarchical  representations  of  Figure  33.  Com¬ 
puting  a  difference  map  between  a  description  of  an 
input  target  image  and  a  previously  learned  generic 
description  leads  to  the  detection  of  differences  be¬ 
tween  the  two  descriptions.  With  that  information, 
the  differences  can  then  be  emphasized,  resulting  in  a 
caricature  of  the  input  description.  Because  certain 
parts  in  the  caricatured  map  have  been  exaggerated, 
they  stand  out  even  more  strongly,  and,  because  the 
non-differences  have  been  suppressed,  attention  can 
be  focused  more  quickly  on  the  parts  of  the  input 
image  description  that  are  most  unusual  and  there¬ 
fore  most  likely  to  carry  discrimination  information. 
Figure  31  contains  an  example  caricature  image  of  a 
tank. 
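
One simple way to express this exaggeration is to push each value of the input description away from the generic description by a gain factor. The sketch below is an assumption-laden illustration, not the authors' method: the linear form, the default `gain` of 2, and the sample values are hypothetical. Locations with no difference stay at the generic value rather than being explicitly suppressed.

```python
def caricature(view, generic, gain=2):
    """Exaggerate differences from the generic view by the given gain.

    gain = 1 returns the input unchanged; gain > 1 emphasizes differences.
    """
    return [
        [g + gain * (v - g) for v, g in zip(v_row, g_row)]
        for v_row, g_row in zip(view, generic)
    ]


# Made-up side views of a generic tank and tank 1.
generic_side = [[5, 5], [5, 5]]
tank1_side = [[5, 8], [4, 5]]
exaggerated = caricature(tank1_side, generic_side)
```

With a gain of 2, a difference of +3 becomes +6 and a difference of -1 becomes -2, while matching locations are unchanged.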

Visual  Navigation  by  MAVIN 

The ATR system design described in the section “Aircraft Recognition
from Visible Image Sequences” has been implemented at Lincoln
Laboratory on a mobile robot called the Mobile Adaptive Visual
Navigator (MAVIN). Shown in Figure 34, MAVIN can be programmed to
travel a reconnaissance path, detect and track objects as it moves,
and recognize objects it has learned. Arrays of light bulbs, such as
the ones shown in Figure 34, have been used for the target objects.
Currently, MAVIN is also able to recognize silhouettes of objects that
can be segmented easily from the background. Equipped with binocular
cameras, MAVIN operates in real time, with feature extraction running
on a PIPE video-rate parallel-processing computer, and all other
neural network computations running on SUN computers. (Capable of
1 billion 8-bit integer operations per second, PIPE was developed for
robotic vision at the National Institute of Standards and Technology
[NIST] and manufactured by Aspex Corp. of New York.)

Our past investigations have incorporated the visual learning and
recognition system into a neural architecture that is capable of
supporting various Pavlovian behavioral-conditioning paradigms based
on learned associations and expectations, including excitatory
conditioning, inhibitory conditioning, secondary conditioning, and the
extinction of conditioned excitors [22, 23]. We have recently extended
the MAVIN system to incorporate the learning and recognition of
environments that are defined by the layout of visual landmarks
observed during exploration [24, 25]. Associative learning methods
similar to those used for learning 2-D feature patterns have been
applied to spatial patterns of recognized visible landmarks to
establish place cells, which qualitatively map an environment based on
its visual surroundings. We are currently incorporating displace cells
into the architecture to code place field transitions that are induced
by robot motions. (A place field corresponds to an area in the
environment where recognized target landmarks possess a similar
spatial layout.) These concepts for the qualitative mapping and
navigation of space are based on behavioral experiments with rats, and
on the physiological measurements of neurons in the rat hippocampus.

An important motivation for developing MAVIN has been to demonstrate
in the laboratory the system’s ability to recognize in real time both
fixed landmarks and mobile targets from a sensor platform that can
navigate through, explore, and map an environment, viewing the scene
from a variety of vantage points. Indeed, MAVIN has proven to be an
excellent experimental domain to test the ATR systems that we have
developed.

Conclusion 

Our strategy of using the unsupervised learning of view-based,
invariant representations in conjunction with evidence accumulation
that exploits view transitions has proven effective in several sensory
domains, and is relevant to both automatic target recognition

Flamethrower-tank  generic  aspects 


FIGURE  33.  Hierarchical  object  representations  are  a  natural  consequence  of  the  Saliency  Map  approach.  The  Saliency 
Maps  direct  a  branching  down  from  generic  object  to  specific  target,  which  may  be  unique  because  of  some  specific  part. 
Because  of  this  hierarchy,  Saliency  Maps  can  be  used  in  the  recognition  process  to  guide  a  rapid  search  among  learned 
categories. 




FIGURE 34. The Mobile Adaptive Visual Navigator (MAVIN) developed at
Lincoln Laboratory. MAVIN, a mobile robot with binocular cameras,
provides a testbed for a passive-vision ATR system in which the
concepts that underlie 3-D object learning and recognition have been
extended to the learning of representations for environments that are
defined by distributions of visual landmarks. This extension supports
the ability for an autonomous sensor platform to explore, map (in a
qualitative fashion), and navigate through environments consisting of
fixed landmarks and moving targets. The neural architecture being
developed is based on studies of the rat hippocampus.


(ATR) and environment navigation. But perhaps the most important
lesson we have learned is that many valuable insights can be gained
from serious study of the brain and behavior. Anatomical,
physiological, and psychophysical studies have all helped shape the
computational theories and system architectures used in our work. We
believe that such studies will continue to enable rapid progress in
the ATR field.

Acknowledgments 

We wish to thank the members of the Surveillance Systems Group at
Lincoln Laboratory for providing us with the ADTS imagery of clutter
and tactical targets, as well as the radar phase history data for the
ISAR targets. We are grateful to Jacques Verly and Carol Lazott of the
Machine Intelligence Technology Group for constructing the ISAR target
imagery from the radar phase histories. We are also indebted to the
Signature Studies and Analysis Group for providing us with the
range-Doppler peak data. This work on reentry-vehicle recognition was
done in conjunction with Bob Gabel of the Machine Intelligence
Technology Group. We also wish to acknowledge our ongoing
collaboration with Professors Stephen Grossberg and Ennio Mingolla of
Boston University's Department of Cognitive and Neural Systems.

This work has been supported by the U.S. Department of the Air Force
and the Office of Naval Research.




REFERENCES 


1. E.A. DeYoe and D.C. Van Essen, “Concurrent Processing Streams in
Monkey Visual Cortex,” Trends in Neuroscience TINS-11, 219 (1988).

2. S. Zeki, “The Visual Image in Mind and Brain,” Scientific American
267, 68 (Sept. 1992).

3. D.A. Fay and A.M. Waxman, “Neurodynamics of Real-Time Image
Velocity Extraction,” chap. 9 in Neural Networks for Vision and Image
Processing, eds. G.A. Carpenter and S. Grossberg (MIT Press,
Cambridge, MA, 1992), pp. 221-246.

4. M. Mishkin, L.G. Ungerleider, and K.A. Macko, “Object Vision and
Spatial Vision: Two Cortical Pathways,” Trends in Neuroscience
TINS-6, 414 (1983).

5. D.I. Perrett, A.J. Mistlin, and A.J. Chitty, “Visual Neurones
Responsive to Faces,” Trends in Neurosciences TINS-10, 358 (1987).

6. D.I. Perrett, M.H. Harries, R. Bevan, S. Thomas, P.J. Benson,
A.J. Mistlin, A.J. Chitty, J.K. Hietanen, and J.E. Ortega,
“Frameworks of Analysis for the Neural Representation of Animate
Objects and Actions,” Journal of Experimental Biology 146, 87 (1989).

7. D.I. Perrett, M.W. Oram, M.H. Harries, R. Bevan, J.K. Hietanen,
P.J. Benson, and S. Thomas, “Viewer-Centred and Object-Centred Coding
of Heads in the Macaque Temporal Cortex,” Experimental Brain Research
86, 159 (1991).

8. P.J. Benson and D.I. Perrett, “Perception and Recognition of
Photographic Quality Facial Caricatures: Implications for the
Recognition of Natural Images,” European Journal of Cognitive
Psychology 3, 105 (1991).

9. M. Seibert and A.M. Waxman, “Spreading Activation Layers, Visual
Saccades, and Invariant Representations for Neural Pattern
Recognition Systems,” Neural Networks 2, 9 (1989).

10. M. Seibert and A.M. Waxman, “Learning and Recognizing 3D Objects
from Multiple Views in a Neural System,” chap. II.12 in Neural
Networks for Perception, vol. 1, ed. H. Wechsler (Academic Press, New
York, 1991), pp. 426-444.

11. M. Seibert and A.M. Waxman, “Adaptive 3-D Object Recognition from
Multiple Views,” IEEE Trans. Pattern Anal. Mach. Intell. 14, 107
(1992).

12. J.J. Koenderink and A.J. van Doorn, “The Internal Representation
of Solid Shape with Respect to Vision,” Biological Cybernetics 32,
211 (1979).

13. S. Grossberg, “Nonlinear Neural Networks: Principles, Mechanisms,
and Architectures,” Neural Networks 1, 17 (1988).

14. R.K. Cunningham and A.M. Waxman, “Astroglial-Neural Networks,
Diffusion-Enhancement Bilayers, and Spatio-Temporal Grouping
Dynamics,” SPIE 1611, 411 (1991).

15. R.K. Cunningham and A.M. Waxman, “Parametric Study of
Diffusion-Enhancement Networks for Spatiotemporal Grouping in
Real-Time Artificial Vision,” Technical Report No. ESC-TR-92-207, MIT
Lincoln Laboratory (6 Apr. 1993).

16. E.L. Schwartz, “Computational Anatomy and Functional Architecture
of Striate Cortex: A Spatial Mapping Approach to Perceptual Coding,”
Vision Research 20, 645 (1980).

17. G.A. Carpenter and S. Grossberg, Pattern Recognition by
Self-Organizing Neural Networks (MIT Press, Cambridge, MA, 1991),
chaps. 9-13.

18. B. Widrow and S. Stearns, Adaptive Signal Processing
(Prentice-Hall, Englewood Cliffs, NJ, 1985).

19. S. Grossberg, Neural Networks and Natural Intelligence (MIT
Press, Cambridge, MA, 1988), chaps. 1-4.

20. A.M. Aull, R.A. Gabel, and T.J. Goblick, “Real-Time Radar Image
Understanding: A Machine-Intelligence Approach,” Linc. Lab. J. 5, 195
(1992).

21. M. Seibert and A.M. Waxman, “An Approach to Face Recognition
Using Salience Maps and Caricatures,” Proc. World Congress on Neural
Networks, Portland, OR (to be published in July 1993).

22. A.A. Baloch and A.M. Waxman, “Visual Learning, Adaptive
Expectations, and Behavioral Conditioning of the Mobile Robot MAVIN,”
Neural Networks 4, 271 (1991).

23. A.A. Baloch and A.M. Waxman, “Behavioral Conditioning of the
Mobile Robot MAVIN,” chap. 6 in Neural Networks: Concepts,
Applications, and Implementations, vol. IV, eds. P. Antognetti and
V. Milutinovic (Prentice-Hall, Englewood Cliffs, NJ, 1991),
pp. 162-200.

24. I.A. Bachelder and A.M. Waxman, “Neural Networks for Mobile Robot
Visual Exploration,” SPIE 1831, 107 (1993).

25. I.A. Bachelder, A.M. Waxman, and M. Seibert, “A Neural System for
Mobile Robot Visual Place Learning and Recognition,” Proc. World
Congress on Neural Networks, Portland, OR (to be published in July
1993).




ALLEN  M.  WAXMAN 

is a senior staff member in the Machine Intelligence Technology
Group, where his focus on research has been in vision processing,
neural networks, mobile robots, and electronic aids for the visually
impaired. Allen also currently holds a joint appointment with the
Center for Adaptive Systems at Boston University. Before joining
Lincoln Laboratory four years ago, he was with the Department of
Electrical, Computer, and Systems Engineering at B.U. He received a
B.S. degree in physics from the City College of New York, and a Ph.D.
degree in astrophysics from the University of Chicago. In 1992, he was
the corecipient (with Michael Seibert) of the Outstanding Research
Award from the International Neural Network Society.


MICHAEL  SEIBERT 

received a B.S. and an M.S. degree in computer and systems
engineering from the Rensselaer Polytechnic Institute, and a Ph.D.
degree in computer engineering from Boston University. He has been
with Lincoln Laboratory for six years; he is currently a staff member
in the Machine Intelligence Technology Group. Michael's focus on
research has been in vision and neural networks, and in 1992 he was
the corecipient (with Allen M. Waxman) of the Outstanding Research
Award from the International Neural Network Society.


ANN  MARIE  BERNARDON 

is a staff member in the Machine Intelligence Technology Group, where
her research has been on machine intelligence, neural networks, and
signal processing. Before joining Lincoln Laboratory seven years ago,
she worked for Voice Processing Inc. She received the following
degrees in electrical engineering: a B.S. from Purdue University and
an S.M. from MIT.


DAVID  A.  FAY 

received a B.S. degree in computer engineering and an M.A. degree in
cognitive and neural systems from Boston University. While pursuing
his graduate studies at B.U., Dave joined Lincoln Laboratory in 1989,
and is currently a staff member in the Machine Intelligence Technology
Group. His research has been on the development of neural network
systems for enhancing radar imagery.




Multidimensional  Automatic 
Target  Recognition  System 
Evaluation 

Paul  J.  Kolodzy 

■  We  are  developing  an  evaluation  facility  that  includes  an  electronic  terrain 
board  (ETB)  to  provide  an  effective  test  environment  for  automatic  target 
recognition  (ATR)  systems.  The  input  to  the  ETB,  which  is  a  high-performance 
computer  graphics  workstation,  is  very  high-resolution  data  (15  cm  in  3-D) 
taken with pixel registration in the modalities of interest (laser radar, passive IR,
and  visible).  The  ETB  contains  sensor  and  target  models  so  that  measured 
imagery  can  be  modified  for  sensitivity  analyses.  In  addition,  the  evaluation 
facility  contains  a  reconfigurable  suite  of  ATR  algorithms  that  can  be  interfaced 
to  real  and  synthetic  data  for  developing  and  testing  ATR  modules. 

A  first-generation  hybrid-architecture  (statistical,  model  based,  and  neural 
network)  ATR  system  is  currently  operating  on  multidimensional  (laser  radar 
range,  intensity  and  passive  IR)  sensor,  synthetic,  and  hybrid  databases  to 
provide  performance  and  validation  results.  A  recent  study  determined  the 
sensor  requirements  necessary  for  target  classification  and  identification  of  eight 
vehicles  under  various  view  aspects,  resolutions,  and  signal  strengths. 

This  article  presents  a  description  of  the  infrared  airborne  radar  used  to 
gather  sensor  data,  a  discussion  of  sensor  fusion  and  the  hybrid  ATR  measure¬ 
ment  system,  and  a  review  of  the  ATR  evaluation  facility.  This  article  also 
discusses  the  computer  manipulation  and  generation  of  laser-radar  and  passive- 
IR  sensor  imagery  and  the  processing  modules  used  for  target  detection  and 
recognition.  We  give  results  of  processing  real  and  synthetic  imagery  with  the 
ATR  system,  with  an  emphasis  on  interpreting  results  with  respect  to  sensor 
design. 

The  battlefield  scenario  continues  to  grow 
in  complexity  as  the  use  of  high-resolution 
sensors  and  precision  strike  weapons  has  forced 
the  increased  use  of  concealment  and  camouflage  tech¬ 
nology  to  improve  vehicle  survivability.  The  advent  of 
multidimensional  sensors  that  trade  individual  sensor 
performance  for  aggregate  system  performance  and 
automatic  target  recognition  (ATR)  systems  that  can 
assist  in  or  automatically  identify  targets  also  are  a 
threat  to  vehicle  survivability.  The  understanding  of 
multidimensional  sensors,  the  algorithms  that  are  used 


to  process  their  data,  and  the  manner  in  which  they 
are  evaluated  is  necessary  to  determine  their  suitabil¬ 
ity  for  military  applications. 

Unfortunately,  the  testing  and  acceptance  of  ATR 
systems  for  military  applications  has  proven  elusive. 
On  one  hand,  many  researchers  are  concerned  that 
not  enough  information  exists  in  one  sensor  modality 
to  build  an  ATR  system  that  performs  effectively 
against  targets  in  natural  and  man-made  clutter.  On 
the  other  hand,  the  use  of  multisensor  information  to 
solve  this  vexing  problem  is  relatively  recent,  and  the 




results are limited. Although we have strong indications that several
sensor modalities are better than one for target identification, no
convincing database of evidence exists.

At Lincoln Laboratory we have constructed a flyable multisensor
measurement system to evaluate the use of single and multiple sensor
modalities for search-and-identification applications. This article
describes the measurement system, which includes a forward-looking
suite of sensors, a down-looking suite of sensors, and an MMW sensor.
We also describe an ATR system for processing laser radar range and
intensity imagery as well as other sensor modalities.

Testing the ATR system to quantify the performance limits of the
multisensor measurement system is an important step in the development
of useful benchmarks and the definition of radar requirements. This
article examines the performance tests we have developed and provides
a summary of test results for spatial extent, image quality, and 3-D
recognition requirements. An ATR evaluation facility is currently
under development to provide an effective test environment for ATR
systems. The inputs to the facility, which is a high-performance
computer graphics workstation and data-processing engine, are very
high-resolution data (15 cm in 3-D) taken with pixel registration in
the modalities of interest (laser range, intensity, passive IR, and
visible) and stored in databases. An electronic terrain board (ETB)
combines the databases with sensor and target models to modify the
measured imagery for ATR sensitivity analyses.

The  Infrared  Airborne  Radar 

The Infrared Airborne Radar (IRAR) is a flyable multisensor
measurement system that consists of a set of active and passive
infrared (IR) and active millimeter-wave (MMW) sensors. This system is
installed in a Gulfstream G-1 twin turboprop test aircraft used by
Lincoln Laboratory; Figure 1 illustrates the locations of these
sensors in the aircraft. We are especially interested in the ability
of the multisensor measurement system to detect targets autonomously
(i.e., without human interaction with the measurement system).

In  the  forward-looking  sensor  suite,  the  active  laser 


[Figure 1 labels: 85.5-GHz MMW radar; recording devices; sensor control panels; video; forward-looking radar (10.6-µm active laser radar, 8-to-12-µm passive IR); down-looking radar (10.6-µm active laser radar, 8-to-12-µm passive IR, 0.8-µm active laser radar)]


FIGURE  1.  Schematic  diagram  of  the  multisensor  measurement  system  on  the  Gulfstream  G-1 
aircraft,  showing  the  location  of  each  individual  sensor  system.  The  two  sensor  suites — forward- 
looking  and  down-looking — are  located  in  the  aft  section  of  the  aircraft,  the  recording  system  and 
electronic  racks  are  located  in  the  midsection,  and  the  antenna  for  the  MMW  radar  is  located  in  the 
nose.  The  forward-looking  sensor  suite  is  mounted  on  an  optical  table  and  then  relayed  through  a 
pod  on  the  fuselage.  The  down-looking  sensor  suite  is  housed  entirely  in  the  pod  aft  of  the  torward- 
looking  sensor  system. 


v:;„vi  r.  Mi'.'Bli)  ' 


118 


Hi  .  iMnwnwr  .iniisvc1 


•  KOl.ODZY 

Mhitulnni  Hyiini.il .  \utonuiiu  I.D  "<  i  AVt  O^Hlt/on  S \Hi  Ul  i  l  .liii.lUtnt 


radar  sensor  measures  absolute  range  with  a  precision 
of  I  m  while  the  passive-1  R  sensor  measures  the  ther¬ 
mal  intensirv  of  the  target  and  scene  in  the  8-to-12- 
iim  band.  I  lie  down-looking  sensor  suite,  which  is  a 
multispectral  active-passive  sensor,  has  the  ability  to 
measure  relative  range  with  a  precision  of  15  cm,  as 
well  as  the  ability  to  measup-  passive-1  R  thermal  in¬ 
tensity.  In  addition,  an  MMW  real-aperture  measure¬ 
ment  system  developed  by  General  Dynamics  of 
Pomona,  California,  is  installed  in  the  aircraft.  I  his 
MMW  sensor  measures  absolute  range  with  a  resolu¬ 
tion  of  0.5  m,  and  is  slaved  to  cover  the  same  search 
area  as  the  forward-looking  sensor. 

All  the  IRAR  sensors  reside  on  board  the  aircraft 
platform.  The  heart  of  the  IRAR  system  is  located  in 
the  center  section  of  the  aircraft.  A  radome  extends 
down  from  the  center  of  the  aircraft,  allowing  the 
laser  beam  of  the  forward-looking  sensor  to  exit 
through  a  germanium  window  on  the  left  side  of  the 
radome.  An  additional  window  immediately  to  the 
right  of  the  germanium  window  is  used  by  the  mea¬ 
surement  system’s  boresighted  color  television  cam¬ 
era,  which  is  used  to  point  the  laser  beam  manually 
and  to  record  a  live  sequence  of  the  measured  scene. 

The radome was modified so that the down-looking
sensor could be placed immediately behind the
forward-looking laser-radar pointing-mirror assembly
and look straight down; the scan direction of the
down-looking sensor is therefore always perpendicular
to the longitudinal axis of the aircraft. The MMW
system is sufficiently small so that the 1-ft-diameter
radar dish and the gimbal mount are totally enclosed
within the nose cone of the aircraft.

Forward-Looking  Laser  Radar 

The transmitter in the forward-looking sensor is an
RF-excited, water-cooled, CO2 waveguide laser operating
at 10.6 µm. In the pulsed mode, the transmitter
laser provides a nominal 25-nsec pulsewidth at approximately
3-W average power at a pulse-repetition
frequency of 20 kHz. In CW operation, the laser can
provide power in excess of 30 W.

A 5-in diameter, afocal, Ritchey-Chrétien telescope
functions both as the transmit and receive aperture of
the sensor to produce a 200-µrad diameter beam
(100-µrad resolution). The sensor uses two linear
12-element arrays of HgCdTe photovoltaic detectors:
one array for the active measurements and one for the
passive measurements. Registration of the active and
passive measurements is always assured because both
arrays share the common telescope.

In the present configuration, the two arrays are
oriented vertically to provide a 10° azimuthal coverage
at 2.5 scans/sec in linescan mode. In a separate
framing mode (25.6 mrad by 12.0 mrad), the scanning
mirrors operate at 20 frames/sec; when the passive
channel is enabled, however, the recording rate is
reduced to 10 frames/sec because of recorder limitations.
Television images from the boresighted TV camera
are digitized and stored on computer tapes. Table
1 shows selected system parameters for the forward-looking
sensor.

Figure 2 is an example of a laser radar range image
and a passive-IR image made simultaneously by the
forward-looking sensor. Two features in these images
are particularly interesting with respect to data fusion
and scene understanding: (1) the road that traverses
vertically in the center of the scene is clearly visible in
the passive-IR image in Figure 2(b) but invisible in
the range image in Figure 2(a) because the road is at
the same elevation as the local ground plane, and (2)
although a tank (at the center left of the scene) has a
negative passive-IR contrast with the background, it
has positive range contrast in the active laser radar
image. We can overcome the measurement limitations
of each individual sensor by fusing the information
from the two sensors to provide enhanced detection
capability.

Table 1. Forward-Looking Laser Radar System Parameters

CO2 laser
  Wavelength                     10.6 µm
  Nominal power, CW              30 W
  Pulsed, average                3 W
Number of detectors              12
Telescope aperture               13 cm
Instantaneous field of view      0.2 mrad
Range sampling interval          1.1 m

FIGURE 2. (a) Passive-IR imagery and (b) laser radar range imagery taken simultaneously at Stockbridge, New
York, by the forward-looking sensor. The passive-IR image in part a is coded by thermal intensity, so that
warmer objects such as vehicles are brighter than cold objects. The range image in part b is coded by color to
distinguish objects at different distances from the viewer.

Millimeter-Wave Radar

To investigate the advantages of combining the output
of two or more diverse sensors, we added the
General Dynamics 85.5-GHz real-aperture MMW
radar to the forward-looking sensor suite. This radar
has low cross-range resolution and high line-of-sight
resolution, and operates at 3.5 mm. Table 2 lists the
operating characteristics of this radar. The MMW
antenna is mounted in the nose section of the aircraft
and is boresighted to the IRAR sensor suite during
pointing-mode operation.

The cross-range resolution of the MMW radar is
such that a 10° azimuthal field of regard is stored as
I k intensity-versus-range profiles on each scan. The
oversampling that occurs in the down-range dimension
is then used to enhance the processing statistics
for detection. The modulation characteristics of the
sensor are such that the line-of-sight range resolution
is 1.67 ft, while data are sampled at approximately half
this value, thus providing the potential for excellent
range resolution on the target.


Figure 3 illustrates the range resolution of the
MMW radar in combination with a passive-IR imaging
sensor. Three logging trucks in the passive-IR
image in Figure 3(a) are each highlighted by a box.
The environmental conditions at the time the data
were taken are responsible for the low passive-IR contrast.
If the MMW radar signal is displayed as a 3-D
image (cross-range, down-range, and intensity), however,
as shown in Figure 3(b), then four of the five
highest intensity peaks shown in the figure correspond
to radar returns from target locations. Two
peaks are determined in Figure 3(b) for the truck in
the center of Figure 3(a) because we obtained strong
distinct returns from both the truck cab and the truck
bed.

Table 2. Millimeter-Wave Radar System Parameters

Operating frequency      85.5 GHz
Transmitter power        15 mW
Modulation format        FMCW
Antenna diameter         12 in
Antenna beamwidth        0.76°, one way
                         0.57°, two way
Range resolution         1.67 ft
PRF                      1600 Hz
Noise figure             20 dB, including system losses

FIGURE 3. (a) Passive-IR imagery and (b) boresighted MMW radar imagery. The MMW radar data are
displayed as a 3-D plot of down-range, cross-range, and thermal-intensity values. The three logging
trucks indicated by boxes in the passive-IR image correspond to four of the five highest MMW radar
intensity peaks. Part b shows two peaks for the one truck in the center of part a because distinct returns
were obtained from both the truck cab and the truck bed.

Down-Looking Laser Radar

The multispectral active-passive down-looking sensor
is a compact multiple-channel system that employs
two lasers for active detection and a single passive
detection channel. This sensor is configured with a
10.6-µm amplitude-modulated continuous wave
(AMCW) CO2 laser and a 0.8-µm AMCW AlGaAs
diode laser for the two active channels, which are
coregistered with an 8-to-12-µm passive detection
channel.

The system was designed with 1-mrad angular resolution
to provide a 15-cm cube on the target from an
optimal measurement height of 150 m. The active-channel
lasers are modulated at 15 MHz to provide
an AMCW waveform that translates to a 10-m range
ambiguity but provides 15-cm precision (i.e., the range
values are produced from 0 to 10 m in 15-cm increments
and they fold over at the range boundaries).
Thus these measurements are relative range measurements
with 15-cm precision, as compared with the
absolute range measurements of the forward-looking
sensor. Table 3 lists selected parameters of the multispectral
active-passive down-looking sensor, and Figure
4 shows five separate images produced by this
sensor during a flyover of the USS Connole.
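
The relation between modulation frequency, range ambiguity, and fold-over can be checked with a few lines of arithmetic. The sketch below is illustrative only: it assumes a 15-MHz modulation frequency (the value consistent with the stated 10-m ambiguity interval, since c/(2f) is about 10 m), and the function names and the simple floor quantization into 15-cm bins are assumptions, not the sensor's actual processing.

```python
SPEED_OF_LIGHT = 299_792_458.0  # m/s


def ambiguity_interval(mod_freq_hz):
    """Range interval over which an AMCW phase measurement is unique: c / (2 f)."""
    return SPEED_OF_LIGHT / (2.0 * mod_freq_hz)


def relative_range_bin(true_range_m, mod_freq_hz, step_m=0.15):
    """Index of the range bin reported for a surface at true_range_m.

    The true range is folded into [0, ambiguity interval) and then
    quantized into bins of width step_m (hypothetical floor quantizer).
    """
    folded = true_range_m % ambiguity_interval(mod_freq_hz)
    return int(folded // step_m)


MOD_FREQ_HZ = 15e6  # assumed modulation frequency
interval = ambiguity_interval(MOD_FREQ_HZ)  # roughly 9.99 m

# Two surfaces one full ambiguity interval apart land in the same bin,
# which is why the sensor reports relative rather than absolute range.
near = relative_range_bin(153.0, MOD_FREQ_HZ)
far = relative_range_bin(153.0 + interval, MOD_FREQ_HZ)
```

Here `near` and `far` are equal even though the two surfaces are about 10 m apart, which is the fold-over behavior described in the text.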

Table 3. Down-Looking Laser Radar System Parameters

Angular resolution           0.5 mrad, x and y axes
Range precision              15 cm
Range ambiguity interval     10 m
Altitude range               400 ft to 1300 ft
Ground coverage              2000 ft at 1000 ft

The multispectral down-looking sensor has two
characteristics of interest for the development and
testing of ATR systems: (1) the viewing aspect allows
the imaging of objects in clutter that are not generally
seen by forward-looking sensors, and (2) the highly
precise range imagery gives us the capability to transform
the observed scene to a variety of viewing aspects.
Figure 5 illustrates this process. Figure 5(a)
contains a photograph of a truck that is camouflaged
by netting and parked on a dirt road in a forest.
Figure 5(b), which is the down-looking range image,
clearly shows the road and the truck, with the height
of the truck above the road encoded in color. Figure
5(c) is a computer-transformed forward-looking range
image of the camouflaged truck from a viewpoint
that is just above the road. In this way, a down-looking
view can be used to develop or test algorithms
for a forward-looking or near-forward-looking
sensor through the use of coordinate transformations.
A more detailed description of how down-looking
data can be utilized for a variety of ATR evaluation
tasks is given in the section entitled “The ATR Evaluation
Facility.”
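
As a rough illustration of the coordinate transformation described above, the following sketch converts a down-looking height map into 3-D points and re-images them from a virtual forward-looking viewpoint. It is a minimal sketch under stated assumptions: a flat Cartesian scene, a hypothetical 15-cm pixel spacing, a sensor looking along the +y axis, and a nearest-surface rule per angular bin. The function names and parameters are illustrative and are not taken from the article.

```python
import math


def height_map_to_points(height_map, pixel_m=0.15):
    """Turn a down-looking height map (rows = y, columns = x) into 3-D points."""
    return [(col * pixel_m, row * pixel_m, h)
            for row, line in enumerate(height_map)
            for col, h in enumerate(line)]


def forward_view(points, sensor, ang_step_rad=0.001):
    """Range image seen by a virtual sensor at `sensor` looking along +y.

    Returns {(el_bin, az_bin): range_m}, keeping the nearest surface in
    each angular bin so that closer objects hide those behind them.
    """
    sx, sy, sz = sensor
    image = {}
    for x, y, z in points:
        dx, dy, dz = x - sx, y - sy, z - sz
        if dy <= 0.0:
            continue  # point is behind the virtual sensor
        rng = math.sqrt(dx * dx + dy * dy + dz * dz)
        az_bin = round(math.atan2(dx, dy) / ang_step_rad)
        el_bin = round(math.atan2(dz, math.hypot(dx, dy)) / ang_step_rad)
        key = (el_bin, az_bin)
        if key not in image or rng < image[key]:
            image[key] = rng
    return image


# Two points on the boresight (the nearer one occludes the farther one)
# and one ground point below the boresight.
demo_points = [(0.0, 10.0, 1.0), (0.0, 20.0, 1.0), (0.0, 10.0, 0.0)]
view = forward_view(demo_points, sensor=(0.0, 0.0, 1.0), ang_step_rad=0.01)
```

In practice the points would come from `height_map_to_points` applied to the down-looking relative-range data; the explicit point list here only keeps the example small.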

[Figure 4 panel labels: 0.8-µm laser radar; 10.6-µm laser radar.]

FIGURE 4. Example of imagery produced by the multispectral active-passive down-looking sensor during a flyover of the
USS Connole. This sensor produces coregistered laser radar range and laser intensity images for wavelengths of 0.8 µm
and 10.6 µm, as well as an 8-to-12-µm passive-IR thermal-intensity image. Note the parked helicopter near the stern of the
ship in each of the sensor domains as well as the depiction of the ship's wake.

FIGURE 5. (a) Optical photograph of a truck covered with camouflage netting on a road in a forest. (b) The relative range
image of the truck as determined by the multispectral down-looking sensor. (c) The 3-D spatial transformed image
illustrates the relative range image data in part b as viewed from a depression angle similar to that of the optical photograph
in part a.

Sensor Fusion

Figure 3 illustrates the possible benefits of fusing
MMW radar imagery and passive-IR imagery. This
figure demonstrates that the MMW radar image can
be used to indicate areas of interest in a coregistered
passive-IR image. Other techniques that incorporate
the detection lists from both sensors usually fuse the
lists by an OR or AND procedure; i.e., the target
must be detected at least on one list (OR) or on all
lists (AND). The OR procedure produces a higher
likelihood of detection at the expense of a high false-alarm
rate. On the other hand, the AND procedure
has a low false-alarm rate at the expense of a lower
likelihood of detection. The next section describes an
AND procedure that fuses sensor data to create a
range-passive histogram, and the following section
describes a maximum-likelihood fusion estimate for
object detection.
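
The trade-off between the OR and AND procedures can be made concrete with a small example. The sketch below treats each sensor's detection list as a set of (row, column) cells in a coregistered image; the cell coordinates, list contents, and function names are made up for illustration.

```python
def fuse_or(*detection_lists):
    """A cell is declared a detection if it appears on at least one list."""
    return set().union(*detection_lists)


def fuse_and(*detection_lists):
    """A cell is declared a detection only if it appears on every list."""
    return set.intersection(*(set(d) for d in detection_lists))


# Hypothetical ground truth and per-sensor detection lists.
targets = {(3, 4), (7, 7), (10, 2)}
mmw_list = {(3, 4), (7, 7), (10, 2)}        # all targets, no false alarms
ir_list = {(3, 4), (7, 7), (1, 1)}          # misses one target, one false alarm

or_list = fuse_or(mmw_list, ir_list)
and_list = fuse_and(mmw_list, ir_list)

or_false_alarms = or_list - targets          # {(1, 1)}
and_missed = targets - and_list              # {(10, 2)}
```

In this toy case the OR list finds every target but carries the passive-IR false alarm, while the AND list has no false alarms but drops the target that only one sensor detected.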


Range-Passive-IR Histogram

Target cueing and detection can be accomplished
with range data alone, with a range-only histogram,
or with a range-passive-IR histogram (which is created
by using an AND operation to fuse range and
passive-IR data registered at the pixel level) [1]. The
range-only histogram is a 3-D mapping of the number
of occurrences of a range value plotted in a coordinate
system of cross-range versus down-range. The
histogram is calculated by scanning the range image
pixel by pixel and adding one count to the histogram
bin that corresponds to the pixel azimuth and the
pixel down-range value:

$$H_{R}(az, rng) = \sum_{el} U(az, el),$$

where

$$U(az, el) = \begin{cases} 1 & \text{if } R(az, el) = rng \\ 0 & \text{otherwise} \end{cases}$$

and where R(az, el) is the absolute range of that pixel,
and rng is the specific range value. Peaks in the 3-D
range-only histogram indicate regions of significant
vertical extent, or verticality, in the image, and the
magnitude of the peak represents the vertical surface
area of an object in the image. The property of verticality
is effective for finding targets in open terrain; it
produces a large number of false alarms, however,
when applied in wooded areas.
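
The range-only histogram can be sketched in a few lines. The example below is an illustration under assumptions, not the article's implementation: the range image is a list of rows (one row per elevation, one column per azimuth), the test for a matching range value is replaced by binning with a hypothetical bin width `rng_step`, and the toy scene values are made up.

```python
def range_only_histogram(range_image, rng_min, rng_step, n_bins):
    """Count, for each azimuth column, how many pixels fall in each range bin.

    range_image[el][az] holds the absolute range of a pixel. The result
    hist[az][bin] is the number of elevations in column az whose range
    falls in that bin.
    """
    n_az = len(range_image[0])
    hist = [[0] * n_bins for _ in range(n_az)]
    for row in range_image:                      # one elevation per row
        for az, r in enumerate(row):
            b = int((r - rng_min) // rng_step)
            if 0 <= b < n_bins:
                hist[az][b] += 1
    return hist


# Toy scene: sloping ground whose range changes from row to row, plus a
# vertical object in column 2 that returns the same range on four rows.
scene = [[1500.0 - 50.0 * el for _ in range(5)] for el in range(6)]
for el in range(1, 5):
    scene[el][2] = 1200.0

hist = range_only_histogram(scene, rng_min=1000.0, rng_step=10.0, n_bins=60)
peak = hist[2][20]   # bin 20 covers 1200 m to 1210 m
```

The ground contributes at most one count per bin in each column, whereas the vertical object piles four counts into a single bin, which is the verticality cue described above.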

The passive-IR thermal intensity can be used as a
discriminant to separate trees from man-made targets
that have a significant positive thermal signature. Pixel-level
fusion of the range image data and the passive-IR
image data is possible because each pixel of the
range and passive-IR images is collocated. Each passive-IR
pixel can be registered, according to its associated
range value, to compute what we define as a
range-passive-IR histogram.

The range-passive-IR histogram is a 3-D mapping
of the sum of the passive-IR intensities plotted in
cross-range versus down-range coordinates derived
from the pixel-registered range image. Figure 6 shows
an example of a range-passive-IR histogram. In Figure
6(a), a passive-IR intensity histogram is calculated for
each column, which corresponds to a particular cross-range
value that uses both the range image to provide
the coordinates for the histogram and the pixel-registered
passive-IR image for the intensity values. An
azimuth value is selected, and then we scan the range
image pixel by pixel along that azimuth column, where
the range value for each pixel selects the histogram
range bin. The corresponding passive-IR intensity
value in Figure 6(b) is then added to that histogram
bin. In this way, a three-dimensional histogram (cross-range,
range, passive-IR intensity) is created, as shown
in Figure 6(c). Peaks in the histogram indicate objects
with vertical extent (i.e., trees, buildings, and vehicles)
and with sufficient thermal contrast with respect
to the background (i.e., running engines, heated
buildings).

[Figure 6 panel labels: (a) range image versus azimuth, with a marked 1.2-km range value; (b) passive-IR image versus azimuth; (c) range/passive-IR histogram.]

FIGURE 6. Schematic diagram of how the range image and
passive-IR image are mapped into a range-passive-IR histogram.
(a) An azimuth value is selected, and the range
image is scanned pixel by pixel along that azimuth column;
the range value for each pixel selects the histogram range
bin. (b) The passive-IR intensity value from the corresponding
passive-IR image column is then added to the histogram
bin. (c) In this way a three-dimensional range-passive-IR
histogram (cross-range, range, passive-IR intensity)
is created.

This calculation is written as

$$H_{RP}(az, rng) = \sum_{el} P(az, el) \times U(az, el),$$

where U(az,el) is as defined previously and P(az,el)
is its processed passive-IR intensity. Peaks in the range-passive
histogram indicate regions of vertical extent
that have positive thermal contrast.
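
A minimal sketch of the range-passive-IR accumulation follows. It assumes pixel-registered range and passive-IR images stored as lists of rows, bins the range with a hypothetical bin width `rng_step` in place of an exact range match, and uses made-up intensity values in which a cold tree and a warm truck have similar vertical extent.

```python
def range_passive_histogram(range_image, ir_image, rng_min, rng_step, n_bins):
    """Sum passive-IR intensity into (azimuth, range-bin) cells.

    range_image[el][az] and ir_image[el][az] must be registered pixel for
    pixel. The range value picks the bin; the IR value is what is added.
    """
    n_az = len(range_image[0])
    hist = [[0.0] * n_bins for _ in range(n_az)]
    for range_row, ir_row in zip(range_image, ir_image):
        for az, (r, p) in enumerate(zip(range_row, ir_row)):
            b = int((r - rng_min) // rng_step)
            if 0 <= b < n_bins:
                hist[az][b] += p
    return hist


# Toy scene: sloping ground with zero processed IR intensity, a cold tree
# in column 0 (four rows at 1200 m), and a warm truck in column 3
# (three rows at 1200 m).
ranges = [[1500.0 - 50.0 * el for _ in range(5)] for el in range(6)]
ir = [[0.0] * 5 for _ in range(6)]
for el in range(1, 5):
    ranges[el][0], ir[el][0] = 1200.0, 1.0
for el in range(2, 5):
    ranges[el][3], ir[el][3] = 1200.0, 9.0

hist = range_passive_histogram(ranges, ir, rng_min=1000.0, rng_step=10.0, n_bins=60)
tree_peak, truck_peak = hist[0][20], hist[3][20]
```

Although the tree has more vertical extent than the truck in this toy scene, the truck's peak is much larger once the thermal intensity is summed, which is how the fused histogram suppresses tree-like false alarms.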

Figure 7 shows how the range-passive histogram
algorithm was applied to an IRAR linescan scene
taken at Fort Devens, Massachusetts. The linescan
scene contains the passive-IR image and laser radar
range image of three trucks and a motor generator set.
The vehicles were not in operation; their thermal
signature is due entirely to solar heating. Figure 8
shows the resulting range-passive histogram. The three
largest  peaks  correspond  to  the  three  trucks  in  the 
scene.  For  each  peak,  the  truck  position  is  now  local¬ 
ized  in  cross-range  and  down-range.  This  example 
clearly  shows  the  value  of  fusing  multiple  sensor  do¬ 
mains  at  the  pixel  level  with  an  AND  operation, 
which  improves  the  probability  of  detection  and  low¬ 
ers  the  probability  of  false  alarms. 

Theoretical  Study  of  Active-Passive  Detection 
of  Multipixel  Targets 

Research  into  the  development  of  a  quasi-optimal, 
single-sensor  detection  processor  for  multipixel  laser 
radar  was  done  by  M.  Mark  [2]  and  resulted  in  the 
generation  of  receiver  operating-characteristic  curves 
for  this  processor.  Mark  used  a  generalized-likelihood 
ratio  test  to  estimate  unknown  parameters  for  a  maxi¬ 
mum-likelihood  estimate.  Computer  simulations  with 
benign  synthetic  scenes,  generated  with  uniform  laser 
intensity,  range,  and  passive-IR  values  for  target  and 
background, were used to provide performance measures.
Recent extensions of this work to multiple sensor
modalities (laser radar range and laser intensity,
passive-IR thermal intensity) were accomplished by
S. Hannon and J. Shapiro [3]. The results of these
computer simulations, which were later confirmed by
experimental data, indicate that for a specified operating
point such as the probability of detection and the
probability of false alarm, the required sensor signal-to-noise
ratios were relaxed for a multisensor measurement
system over a single sensor system.

FIGURE 7. A scene containing three trucks and a motor generator as imaged by (a) the
passive-IR sensor and (b) the laser radar range sensor in the forward-looking sensor suite.
The trucks and generator are clearly visible in the center of the passive-IR image. The laser
radar range image depicts the objects as silhouettes standing out of the sloping terrain and
in the same location as in the passive-IR image.

FIGURE 8. The result of processing the data in Figure 7
with the range-passive-IR histogram. The down-range values
are color coded in the same manner as the laser radar
range image in Figure 7. The four highest peaks correspond
to the three trucks in the scene. These peaks would
cue a classification processor to a region of interest.

Figure 9 depicts the sensor/target requirements for
a 10-pixel target (2 pixels by 5 pixels) on a 1000-pixel
image (20 pixels by 50 pixels). The target size can be
scaled to simulate a tank-sized vehicle at a distance of
approximately 5 km with a sensor field of view (given
a 6° depression angle) of 15,000 m². Figure 9 indicates
the sensor requirements for detecting 99% of
tank-sized vehicles at 5 km with a false-alarm rate of
10⁻³ per image, or of 0.1 km⁻². Because the simulations were
done on idealized scenes, however, the results are not
directly transferable to a specific sensor design. The
trends still indicate a reduction for either passive-IR
signal-to-noise ratios (SNR) or laser radar carrier-to-noise
ratios (CNR) when a combination of two sensors
is employed.




[Figure 9 plot title: Sensor/target requirements. Probability of false alarm = 10⁻³; probability of detection = 0.99.]


FIGURE 9. Sensor/target requirements for multipixel target
detection using the generalized likelihood ratio test for
single and multiple sensor modalities to detect a tank-size
target at 5 km with a probability of detection of 0.99 and a
false-alarm rate of 10⁻³ per image or 0.1 km⁻² [3]. A 7-dB
SNR is required for a passive-only sensor system, and a
12-dB CNR is required for a laser radar range-only sensor
system. The SNR and CNR requirements are relaxed for a
combined passive-range sensor system or a passive-range-intensity
sensor system.


Hybrid  ATR  System 

We have developed ATR processing modules for the
primary sensor groups described previously; these
groups are laser radar intensity, range, passive-IR thermal
intensity, and MMW. Although the individual
processing modules can vary among sensor groups,
the general processing structure has the same sequence
of stages: cleanup, detection, segmentation, feature
extraction, invariant mapping, and classification. The
general ATR system was originally developed to operate
on laser radar range and intensity imagery, and the
results presented in this article are based on this imagery.
Figure 10 illustrates the processing modules for
the range-imagery recognition system; this system is
described in more detail below.

Modular ATR System Concept

The unambiguous range image is first processed by
the cleanup stage to reduce data anomalies and enhance
the image. The cleanup stage attempts to reconstruct
the most probable input image that would
produce the measured sensor image. This reconstruction
clarifies the image appearance, and makes the
returns from the various objects in the scene appear
more continuous and complete by reducing sensor
and scene artifacts such as dropouts and anomalies.

Next, the enhanced image is processed by the detection
stage to identify correctly sized regions of
constant range as potential targets. The detection stage
extracts these regions from the background clutter
and removes the ground plane. The detected target at
the output of this stage is a silhouette consisting of
multiple fragments and rough boundaries.

The multiple fragments are combined by the segmentation
stage into a complete, smooth, filled silhouette.
The completed silhouette is then separated
by the feature extraction stage into feature regions
(e.g., barrel, turret, body, and tread for a tank). For
this article, the entire target silhouette is considered
the single feature. The silhouette is then mapped by
the invariant-mapping stage into an abstract pattern
that is invariant to translation, rotation, and scale
within the sensor field of view. This invariant pattern
is processed by the classification stage, which initially
learns to cluster the invariant maps into groups and
then, after the training cycle, classifies the input data
with respect to its learned categories.

Image  Cleanup 

To provide adequate recognition performance in a
noisy environment, the cleanup stage must be capable
of using prior knowledge to restore measured images.
We present here an image-restoration model that quantitatively
incorporates prior knowledge of the measurement
process and scene. The model is based on a
Bayesian formulation using Markov random fields, as
introduced by S. Geman and D. Geman [4]. The
processing is massively parallel because the Markov-random-field
assumption allows the image to be
decoupled into a large number of connected local
neighborhoods, each of which can be processed independently.
The local-neighbor information is spread
out in time such that a global image restoration is
effected when the image-restoration system reaches a
steady state.

Real-time image restoration is possible by using
the model with a massively parallel single-instruction
multiple-data (SIMD) computer such as the Connection
Machine or a direct hardware implementation
on a custom microprocessor. A more detailed description
of the image-cleanup process is given in this issue
in the article by Murali M. Menon entitled “An Efficient
MRF Image-Restoration Technique Using Deterministic
Scale-Based Optimization.”

FIGURE 10. The six processing modules of the range imagery-recognition system: cleanup of
sensor artifacts, detection of potential targets, segmenting targets to improve image characteristics,
extraction of relevant features, invariant mapping of features to remove translation and rotation
effects, and the classification of features into target categories.

We applied the image-cleanup process to a simple
synthetic image corrupted with noise according to a
measurement model described in the literature [3].
The noise does not have a Gaussian distribution and
is based on realistic sensor measurements. The original
noise-free synthetic image has a simple geometric
shape at a constant pixel value, with a background
that linearly increases in pixel value from the top of
the image to the bottom. Figure 11(a) shows the
uncorrupted image, while Figure 11(b) shows the
image with 70% of the pixels corrupted with noise.
The original image has 256 gray levels, and the noise
spans the entire range of possible pixel values. Except
for a few discrepancies at the boundary, the restoration
shown in Figure 11(c) is nearly perfect, especially
the recovery of the sloping background.
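
A test image of the kind described above is easy to generate for experimenting with restoration methods. The sketch below is only an approximation of the setup in the text: the image size, the square shape, the object gray level, and especially the uniform noise distribution are assumptions (the article states only that the noise is non-Gaussian and spans all pixel values), and the function names are hypothetical.

```python
import random


def make_test_image(size=64, levels=256, obj_value=200):
    """Square object of constant value on a background ramp (top to bottom)."""
    image = []
    for row in range(size):
        background = row * (levels - 1) // (size - 1)
        image.append([background] * size)
    lo, hi = size // 4, 3 * size // 4
    for row in range(lo, hi):
        for col in range(lo, hi):
            image[row][col] = obj_value
    return image


def corrupt(image, fraction=0.7, levels=256, seed=0):
    """Replace a fraction of the pixels with values drawn from the full range.

    Uniform noise is an assumption made for this sketch. Returns the noisy
    image and the set of (row, col) positions that were replaced.
    """
    rng = random.Random(seed)
    height, width = len(image), len(image[0])
    coords = [(r, c) for r in range(height) for c in range(width)]
    chosen = rng.sample(coords, round(fraction * height * width))
    noisy = [row[:] for row in image]
    for r, c in chosen:
        noisy[r][c] = rng.randrange(levels)
    return noisy, set(chosen)


clean = make_test_image()
noisy, corrupted = corrupt(clean, fraction=0.7)
```

A restoration algorithm can then be scored by comparing its output with `clean`, either over all pixels or only over the positions in `corrupted`.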

Target Detection

The detection stage of the ATR processing system
extracts target-like regions from the enhanced range
image produced in the cleanup stage. The process
occurs in three phases: (1) regions of interest are
selected, (2) target-like objects are detected, and (3)
objects are extracted from the scene. Regions of interest
are located by using range-only or range-passive-IR
histograms, as previously described in the article.
The peaks of these histograms indicate regions of
significant vertical extent (i.e., constant range with
varying elevation), or a significant thermal signature
with some vertical extent. The selected regions are
searched for areas of constant range that have range
contrast with their neighbors and are similar in size
(both in absolute height and width) to a target of
interest. The target-like region is then separated from
the background by selecting only the pixels with that
range value. The object is then extracted from this
selected image by computing and removing the ground
plane.

FIGURE 11. The effect of processing a synthetic laser radar range image of a geometrical object with the image-cleaning
neural network. (a) The original noise-free image, (b) the image with 70% of the pixels corrupted with noise, and (c) the
restoration of the original image from the corrupted image with only a few discrepancies at the image boundary.
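
The article does not say how the ground plane is computed. One simple possibility, shown here purely as an assumed stand-in, is to fit a plane to the range image by least squares and keep the pixels that lie well in front of it. The closed-form fit below relies on using every pixel of a rectangular grid (so the centered azimuth and elevation indices are uncorrelated), and the threshold and scene values are made up.

```python
def fit_ground_plane(range_image):
    """Least-squares plane r = c + a*(az - az0) + b*(el - el0) over all pixels."""
    n_el, n_az = len(range_image), len(range_image[0])
    az0, el0 = (n_az - 1) / 2.0, (n_el - 1) / 2.0
    mean_r = sum(sum(row) for row in range_image) / (n_el * n_az)
    s_xx = n_el * sum((az - az0) ** 2 for az in range(n_az))
    s_yy = n_az * sum((el - el0) ** 2 for el in range(n_el))
    s_xr = sum((az - az0) * r for row in range_image for az, r in enumerate(row))
    s_yr = sum((el - el0) * r for el, row in enumerate(range_image) for r in row)
    a, b = s_xr / s_xx, s_yr / s_yy
    return lambda az, el: mean_r + a * (az - az0) + b * (el - el0)


def remove_ground_plane(range_image, threshold_m):
    """Return the (el, az) pixels that are nearer than the fitted plane by more than threshold_m."""
    plane = fit_ground_plane(range_image)
    return {(el, az)
            for el, row in enumerate(range_image)
            for az, r in enumerate(row)
            if r - plane(az, el) < -threshold_m}


# Toy scene: ground range falls 50 m per row; an upright object at 1300 m
# occupies rows 1 to 3 in columns 3 to 5, standing on the ground in row 4.
scene = [[1500.0 - 50.0 * el for _ in range(8)] for el in range(8)]
for el in (1, 2, 3):
    for az in (3, 4, 5):
        scene[el][az] = 1300.0

object_pixels = remove_ground_plane(scene, threshold_m=20.0)
```

Because the object pixels are included in the fit, they bias the plane slightly; a more careful version would refit after excluding them, but the single pass is enough to separate the object in this example.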

1  inure  I  dial  shows  the  initial  ratine  image  ot  an 
M  tS  tank  at  “(HI  m  and  the  siihseijuent  detection 
result  that  was  formed  In  using  the  previously  de¬ 
scribed  rangc-onlv  histogram  and  remov  ing  the  ground 
plane.  I  inure  ld(b)  slums  the  M  tS  tank  after  the 
cleanup  stage  and  detection  stage  ot  processing. 


(a) 


"Wt  i//. uimi 

The segmentation stage of the ATR system smooths
the boundaries and completes the fragments of the
detected potential target. The boundary-contour system
(BCS), a subsystem of a visual processing theory
developed by S. Grossberg and E. Mingolla [5], is
used to generate the perceived segmentation of the
potential target, with respect to illuminance contrasts.
The BCS system consists of two stages: an
oriented-contrast (OC) filter and a cooperative-competitive
(CC) loop. The OC filter measures local luminance




FIGURE  12.  (a)  The  initial  range  image  of  an  M48  tank;  (b)  the  subsequent  detection  result  that  was 
formed  by  using  the  range-only  histogram  and  removing  the  ground  plane. 


• KOLODZY

Multidimensional Automatic Target Recognition System Evaluation


differences, or edges, within an image at a number of
different orientations. This filter models the
orientation-selective cells discovered by D. Hubel and T.
Wiesel [6] in the human visual system. These oriented
edge strengths are then allowed to compete and
cooperate with one another in the CC loop to generate
the perceived boundary contours.

The four layers of the CC loop consist of two
competitive layers, one cooperative layer, and a feedback
layer. The first competitive layer thins and sharpens
boundaries within the image by allowing competition
for dominance in the final boundary
segmentation between neighboring edge strengths of
the same orientation. The second competitive layer
straightens jagged or noisy boundaries by allowing
competition between edge-contrast strengths with differing
orientations at the same location. The cooperative
layer completes and connects boundaries by allowing
edges of like orientation to cooperate over a
distance in the image. The feedback layer introduces
into the system any new boundaries formed by the
cooperative layer.

The OC filter is implemented by convolving a set
of orientationally tuned digital filters with the input
image. The CC loop is modeled by using a set of four
coupled nonlinear differential equations for each
orientation and location within an image. Input to the
CC loop is static; therefore, the boundary is complete
when the system of differential equations is in
equilibrium.
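The filtering step of the OC stage can be illustrated with a small sketch. The kernels below are generic 3 x 3 difference kernels chosen for illustration, not the orientationally tuned filters of the BCS, and only two orientations are shown. Because only the magnitude of the response is kept, correlation and convolution give the same result here. The CC loop is not modeled. All names are hypothetical.

```python
def correlate_valid(image, kernel):
    """Slide the kernel over the image (valid region only) and return
    the absolute value of each weighted sum."""
    k_rows, k_cols = len(kernel), len(kernel[0])
    output = []
    for r in range(len(image) - k_rows + 1):
        out_row = []
        for c in range(len(image[0]) - k_cols + 1):
            acc = sum(kernel[i][j] * image[r + i][c + j]
                      for i in range(k_rows) for j in range(k_cols))
            out_row.append(abs(acc))
        output.append(out_row)
    return output


# Illustrative stand-in kernels, one per edge orientation.
KERNELS = {
    "vertical_edge": [[-1, 0, 1], [-1, 0, 1], [-1, 0, 1]],
    "horizontal_edge": [[-1, -1, -1], [0, 0, 0], [1, 1, 1]],
}


def oriented_contrast(image):
    """Return one contrast-strength map per orientation."""
    return {name: correlate_valid(image, kernel)
            for name, kernel in KERNELS.items()}


# A silhouette boundary running top to bottom: dark left, bright right.
step_image = [[0, 0, 1, 1] for _ in range(4)]
strengths = oriented_contrast(step_image)
```

The vertical boundary excites only the vertical-edge map, which is the orientation selectivity the OC filter is meant to provide.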

The BCS algorithm has been previously applied to
laser radar imagery as reported by Kolodzy et al. [7]
and by E. Van Allen [8, 9]. Figure 13 shows an
example of BCS processing on the range image of an
M48 tank. The input range silhouette in the upper
left of the figure is the input to the segmentation
stage. The range-silhouette image is sampled to obtain
the oriented contrast strengths by using the OC
filter in each of twelve orientations. These oriented
contrast strengths are then processed by the CC loop
to produce twelve new images, which are compressed
into a single image by using one of two methods: (1)
compute the maximum contrast strength of any
orientation at each pixel location (upper right of Figure
13), or (2) sum the contrast strengths across all the
orientations at each pixel location (lower left of Figure
13). The compressed image is then filled to form a
completed and smoothed silhouette of the potential
target for classification. The specific filled image shown
in the lower right of Figure 13 is the result of using
the summing method for compressing the results of
the CC loop.
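The two compression methods can be written compactly. The sketch below assumes each orientation image is a list of rows of equal size; the function name and the small integer arrays are hypothetical stand-ins for the twelve CC-loop outputs.

```python
def compress_orientations(orientation_maps, method="sum"):
    """Collapse a stack of per-orientation maps into one image by taking
    either the sum or the maximum across orientations at each pixel."""
    if method not in ("sum", "max"):
        raise ValueError("method must be 'sum' or 'max'")
    combine = sum if method == "sum" else max
    rows, cols = len(orientation_maps[0]), len(orientation_maps[0][0])
    return [[combine(m[r][c] for m in orientation_maps)
             for c in range(cols)]
            for r in range(rows)]


# Three 2 x 2 orientation maps (hypothetical values).
maps = [
    [[0, 2], [5, 0]],
    [[1, 2], [0, 0]],
    [[3, 0], [5, 0]],
]
summed = compress_orientations(maps, "sum")
maximum = compress_orientations(maps, "max")
```

The summed image rewards pixels where several orientations respond at once, whereas the maximum image keeps only the strongest single orientation.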

Feature  Extraction 

In general, the filled and smoothed image provided
by the segmentation stage of the ATR system is then
used to extract relevant features for classification. Many
different feature domains (images or vectors) such as
image  geometry,  object  p  :  o,,  fractal  dimensions,  dis¬ 
tance  of  hot  spots  from  the  central  locations,  and  the 
Hough transform can be used and are part of ongoing
research.

In particular, model-based systems have been
developed at Lincoln Laboratory to parse images into
geometric features (such as circles, squares, rectangles)
and subsequently classify those features into target
features (such as barrel, hull, turret). These
model-based systems are discussed in the article by J. Verly et
al. entitled “Machine Intelligence Technology for
Automatic Target Recognition” [10]. The use of
model-based systems for feature extraction in this ATR
system has been evaluated previously by D. Dudgeon et
al. [11], and they are not discussed further in this
article. For our purposes, the pixel image provided
by the segmentation stage is used as the feature for
classification.

Invariance  Mapping 

The segmented silhouette is spatially mapped to eliminate
translation, rotation, and scale variations prior to
classification. An invariant silhouette is therefore built
directly into the classifier memory to form a single
compact representation for the target. This invariance
reduces the number of stored patterns from one pattern
for each of several terrain angles and ranges to a
single stored pattern, which reduces memory requirements
and search times and improves efficiency.
Invariance can be obtained by using the following
process: (1) locate the segmented silhouette in the field of
view, (2) detect the silhouette edges, and (3) spatially
map the silhouette edges. The resultant abstract pattern
has the desired invariance and is processed directly
by the classifier in the next stage.

VOLUME 6, NUMBER 1, 1993 THE LINCOLN LABORATORY JOURNAL 129

FIGURE 13. Segmentation of a laser radar image of an M48 tank with the boundary-contour system (BCS)
showing smoothed boundaries and connected segments. The image in the upper left is the original segmented
range silhouette. The result of applying the BCS using the maximum-contrast edge-strength method is shown in
the upper right and the result using the summed-contrast edge-strength method is shown in the lower left. The
summed-contrast edge-strength result was then filled and is shown in the lower right.

The target silhouette is located in the plane of the
field of view by calculating a position-weighted sum,
or centroid, of its pixel intensities. The centroid of the
segmented silhouette is then used as the origin in the
spatial mapping that follows.
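As an illustration of the position-weighted sum, the sketch below computes an intensity-weighted centroid in (row, column) coordinates. The function name and the small test silhouette are hypothetical.

```python
def intensity_centroid(image):
    """Return the (row, column) centroid of an image, weighting each
    pixel position by its intensity."""
    total = sum(sum(row) for row in image)
    if total == 0:
        raise ValueError("image has no nonzero pixels")
    row_centroid = sum(r * value
                       for r, row in enumerate(image)
                       for value in row) / total
    col_centroid = sum(c * value
                       for row in image
                       for c, value in enumerate(row)) / total
    return row_centroid, col_centroid


# A 2 x 2 block of unit intensity in the lower right of a 3 x 3 image.
silhouette = [
    [0, 0, 0],
    [0, 1, 1],
    [0, 1, 1],
]
centroid = intensity_centroid(silhouette)
```

Using the centroid as the mapping origin removes translation, since shifting the silhouette in the field of view shifts the centroid by the same amount.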

The silhouette is next detected for edge strengths.
The edge-detection algorithm uses contrast-sensitive
oriented elliptical receptive fields. In this approach,
the receptive fields are passed over the image to sum
the pixel energy present in the area centered around
each pixel. The major axes of the elliptical receptive
fields are oriented in as many as twelve directions to
calculate edge strength as a function of orientation.

The output at each pixel in this edge image is the
value of the strongest receptive-field orientation at
that pixel. Because edge detection by receptive-field
processing is computationally intensive, there is a
trade-off between orientation accuracy and processing
time. The work described in this article was
satisfactorily accomplished by using four orientations for
the receptive fields.

The spatial mapping function provides target rotation
and scale invariance within the plane of the field
of view. The function used in this work is a log-polar
mapping of the edge-strength image about its
centroid, as shown in Table 4. The log-polar mapping is
biologically inspired by the visual field mapping of
the human visual cortex, as demonstrated by E.
Schwartz [12].

The log-polar mapping transforms the edge-detected
range slice to a log-radius, polar-angle coordinate
system by using the centroid of the silhouette as
its reference point. Determining the location of the
image centroid is important because the accuracy of
the log-polar mapping is often highly sensitive to
errors in the position of the centroid. Tests using
noisy simulated imagery, however, have shown the




Table 4. Invariant Mapping of Silhouette Edge Strengths for Translation, Rotation,
and Scale Invariance in the Sensor Field of View

Translation   Pixel centroid locates origin for mapping log radius and polar angle
Rotation      Rotation in field of view is mapped to shift in polar angle
Scale         Scale (or range) is mapped to shift in log radius


centroid  calculation  and  log-polar  mapping  to  have 
robust  behavior. 

The log-polar mapping around the centroid of the
target maps rotation in the field of view to a shift in
polar angle, while it maps range to the target as a shift
in log radius. This mapping is insensitive to rotation
and scale variations; cross-correlation with an
unrotated, unscaled log-polar mapping gives an estimate
of the amount of rotation and scaling present in the
detected silhouette.
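The shift behavior can be checked numerically for a single edge point. The sketch below assumes the centroid is at the origin and uses an arbitrary test point, rotation angle, and scale factor; the function names are hypothetical. The mapping is undefined at the centroid itself, where the radius is zero.

```python
import math


def log_polar(x, y, cx=0.0, cy=0.0):
    """Map a point to (log radius, polar angle) about the origin (cx, cy)."""
    dx, dy = x - cx, y - cy
    return math.log(math.hypot(dx, dy)), math.atan2(dy, dx)


def rotate_and_scale(x, y, angle, scale):
    """Rotate a point about the origin by angle (radians), then scale it."""
    c, s = math.cos(angle), math.sin(angle)
    return scale * (c * x - s * y), scale * (s * x + c * y)


log_r0, theta0 = log_polar(3.0, 4.0)
x1, y1 = rotate_and_scale(3.0, 4.0, angle=0.3, scale=2.0)
log_r1, theta1 = log_polar(x1, y1)

shift_log_r = log_r1 - log_r0   # depends only on the scale factor
shift_theta = theta1 - theta0   # depends only on the rotation angle
```

Every point of the silhouette is displaced by the same two shifts, which is why a cross-correlation against an unrotated, unscaled map can recover them.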

Another method for making the log-polar mapping
invariant to rotation and scale in the field of
view is to calculate the magnitude of its Fourier
transform. The shift property of the Fourier transform
eliminates the rotation and scaling shifts of the
log-polar mapping by treating the mapping as a periodic
function, as reported for laser radar range imagery by
Kolodzy et al. [7]. While this method has merit, it
will not be discussed further here; it is the subject of
other research [13].
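The shift property can be demonstrated in one dimension. The sketch below computes the magnitude of a discrete Fourier transform directly from its definition and compares a sequence with a circularly shifted copy. It is an illustration only: the sequence stands in for one row of a log-polar map, the shift is treated as exactly periodic (true for polar angle, an approximation for log radius), and the function names are hypothetical.

```python
import cmath


def dft_magnitude(samples):
    """Return the magnitude of each discrete Fourier coefficient."""
    n = len(samples)
    return [abs(sum(samples[t] * cmath.exp(-2j * cmath.pi * k * t / n)
                    for t in range(n)))
            for k in range(n)]


def circular_shift(samples, shift):
    """Shift a list to the right by `shift` places with wraparound."""
    shift %= len(samples)
    return samples[-shift:] + samples[:-shift]


profile = [0.0, 1.0, 3.0, 2.0, 0.0, 0.0, 1.0, 0.0]
shifted = circular_shift(profile, 3)

magnitude_original = dft_magnitude(profile)
magnitude_shifted = dft_magnitude(shifted)
```

A circular shift changes only the phase of each Fourier coefficient, so the two magnitude spectra agree to within rounding error.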

Classification 

In the final stage of the ATR system, a neural network
is used to classify the abstract invariant maps into
potential target categories. The adaptive resonance
theory (ART) developed by G. Carpenter and S.
Grossberg [14] defines a class of unsupervised neural
network classifiers that cluster an N-dimensional
input vector into a finite number of stable categories.
This clustering is a necessity if large training sets are
to be used.

Supervised networks and/or model-based systems
require exact knowledge of the target, or ground truth,
for each exemplar. For most large systems, thousands
(or millions) of training frames would need to be
ground truthed, which is a daunting if not impractical
task. The ART-2 network, which is illustrated in
Figure 14, is basically a two-level correlation classifier;
this algorithm has been shown by R. Lippmann
[15] and Menon and Kolodzy [16] to be similar to
the k-means clustering algorithm.

The ART-2 network is different from early ART
structures because it is designed to classify analog,
rather than binary, input patterns [17]. This analog
capability requires a robust structure that pays strict
attention to memory stability. The ART-2 network
classifies and stores patterns in the following manner.
The first layer (F1) normalizes the input with respect
to the feedback signal from the second layer (F2),
becoming a short-term memory (STM) trace. This trace
activates nodes in the second layer proportional to the
magnitude of its correlation with the corresponding




•  Generalized  mean  processing  nodes 

•  Normalization  nodes 


FIGURE 14. Processing diagram of the ART-2 neural network. The input
image is presented and stabilized by the node-level processing in the F1
layer; the result is then correlated with stored long-term memory (LTM)
patterns in the F2 layer. If the resultant correlation is not large enough
with respect to the vigilance parameter, a reset signal is propagated as
feedback down to the F1 layer. If none of the LTM patterns are sufficient,
then a new LTM pattern is formed.


stored memory patterns, or long-term memory (LTM)
traces. If the degree of match between the normalized
input STM trace and the LTM trace associated with
the most highly activated node in F2 passes a
vigilance parameter, the STM trace in F1 is learned onto
the LTM trace, thus storing the differences between
these memory traces. Should a mismatch occur, a
reset signal causes the input pattern to select another
LTM category. If no existing LTM category can be
found that matches the input pattern, a new category
is created, which illustrates the ability of the ART-2
network to respond to a novel signal. Generally,
similar patterns are categorized together because of high
interpattern correlation, and these patterns
continually activate the same category node in F2.
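The match, learn, or create-new-category cycle can be illustrated with a much simplified correlation clusterer. The sketch below is not the ART-2 network: it omits the F1 node dynamics and the reset-and-search over alternative categories (here the choice and match measures coincide, so if the best category fails the vigilance test, all do). The class name, the learning rate, and the vigilance value are assumptions made for illustration.

```python
import math


def normalize(vector):
    """Scale a vector to unit Euclidean length."""
    norm = math.sqrt(sum(x * x for x in vector))
    if norm == 0.0:
        raise ValueError("cannot normalize a zero vector")
    return [x / norm for x in vector]


class VigilanceClusterer:
    """Simplified stand-in for the category-formation cycle."""

    def __init__(self, vigilance, learning_rate=0.5):
        self.vigilance = vigilance
        self.learning_rate = learning_rate
        self.prototypes = []  # plays the role of the stored LTM traces

    def train(self, pattern):
        """Assign the pattern to a category and return its index."""
        x = normalize(pattern)
        best_index, best_score = None, -1.0
        for index, prototype in enumerate(self.prototypes):
            score = sum(a * b for a, b in zip(x, prototype))
            if score > best_score:
                best_index, best_score = index, score
        if best_index is not None and best_score >= self.vigilance:
            # Match passes vigilance: move the stored trace toward the input.
            rate = self.learning_rate
            blended = [(1.0 - rate) * p + rate * a
                       for p, a in zip(self.prototypes[best_index], x)]
            self.prototypes[best_index] = normalize(blended)
            return best_index
        # No stored trace matches well enough: form a new category.
        self.prototypes.append(x)
        return len(self.prototypes) - 1


clusterer = VigilanceClusterer(vigilance=0.9)
inputs = [[1.0, 0.0, 0.0], [0.9, 0.1, 0.0], [0.0, 1.0, 0.0], [0.1, 1.0, 0.0]]
labels = [clusterer.train(pattern) for pattern in inputs]
```

Raising the vigilance makes the match test stricter, so more categories form for the same inputs; lowering it merges dissimilar patterns.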

The final step of the classifier training is to
associate target labels with LTM traces. Each LTM trace for
an individual target is provided a label unique to that
target. Multiple LTM traces for an individual target
are formed because of either different views or
statistical variability of the target. For example, a tank at a
head-on perspective looks different from a tank at a
broadside perspective, and thus would form two
different categories. Also, at a low SNR the signal could
change significantly enough to cause the classifier to
form a new category if it is presented with a new noise
structure.

Interpretation  of  Results 

The performance of an ATR system can be indicated
either by the score of each individual processing
module or by the overall system score. This article uses the
overall system score as a measure of results, with a
higher concentration on the capabilities of the
classifier. For supervised classifiers, the performance is
commonly measured by the number of correct responses
of the system when it is given a test set of input
images. The ATR system presented here incorporates
an unsupervised classifier, which uses a larger variety
of performance measures. This article uses a scoring
method based on the number and population
distribution of categories created by the classifier during
training.

Unsupervised classifiers are generally clustering
algorithms that group input feature vectors into a finite
number of categories. A user-defined distance metric
is used to determine whether an input vector is to be
clustered, or bunched, with an existing category. The
number of categories produced (given a specific
training set) indicates the ability of the classifier to
generalize. A classifier responding with more than one
category for an object is not unreasonable if the features
the ATR is extracting change significantly. For the
ATR system presented here, this change in features
occurs for the log-polar map when the vehicles are
rotated out of plane. A classifier that requires only a
few categories to perform recognition is desirable.

Two measures of a classifier’s ability to generalize
are currently being used: (1) the number of categories
required for a given training set, and (2) the number
of populated categories formed for the same training
set. The differences between these two measures are
found in the interpretation of sparsely populated
clusters or categories. For example, if 100 inputs produce
five categories populated by 95, 2, 1, 1, and 1
examples, respectively, then either five separate
categories are necessary, or one single category is
sufficient at the cost of five incorrect responses.
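The two measures can be computed for the 100-input example with a short sketch. The population threshold that separates a populated category from a sparse one is an assumption introduced here for illustration; the text does not specify one. The function name is hypothetical.

```python
def category_measures(populations, min_population):
    """Return (total categories, populated categories, inputs left
    without a category if the sparse categories are dropped)."""
    total_categories = len(populations)
    populated = [n for n in populations if n >= min_population]
    unrepresented_inputs = sum(populations) - sum(populated)
    return total_categories, len(populated), unrepresented_inputs


# Category populations from the 100-input example; the threshold of 3
# is a hypothetical choice that keeps only the dominant category.
measures = category_measures([95, 2, 1, 1, 1], min_population=3)
```

With this threshold the result is five categories in total, one populated category, and five inputs that would be answered incorrectly if the four sparse categories were discarded.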

The trade-offs of these measures are identical to
those for fielded ATR systems: performance versus
hardware requirements. If every vehicle needs to be
recognized, then all possible variations, including those
categories individually populated, or outliers (those
variations whose characteristics are rarely viewed), are
required to be modeled and retained in the ATR
system. It is possible, as shown by the 100 input
examples above, that a significant reduction in the
hardware requirements (i.e., memory) of the system
can be obtained by allowing a certain reduction in
recognition capability.


Evaluations Using Laser Radar Imagery

We have investigated the ability of this ATR system to
classify laser radar range imagery of various military
targets correctly. This system has been tested on a
limited amount of imagery obtained with
ground-based sensors built by the Opto-Radar Systems group
at Lincoln Laboratory. The results of these tests are
presented below.

The full capabilities (and deficiencies) of an ATR
system, however, must also be determined, and this
determination is possible only through exhaustive
testing requiring large amounts of sensor data. Many
conditions can be tested to determine the capabilities
of the ATR system; we used three conditions: (1)
CNR, (2) out-of-plane rotation, and (3) number of
pixels on target. Unfortunately, the amount of sensor
data required to test these three conditions thoroughly
by using real sensor data is prohibitive in both time
and cost. The use of synthetic imagery to place bounds
on system capabilities is the logical alternative.

Synthetic laser radar range target imagery was
generated by using Environmental Research Institute of
Michigan (ERIM) wire-frame models of a variety of
military and non-military vehicles. Background
imagery was generated by using a flat ground plane
projected with the attack angle of the sensor. Perfect
object extraction from the ground plane was assumed
for this study. Sensor statistics (e.g., noise) were added
by using the laser radar range and intensity models
developed by J. Shapiro et al. [18, 3, 4]. This
procedure was used to generate targets for the three test
conditions listed above.

Ground-Based  Sensor  Data  and  Results 

In 1981 at Camp Edwards, Massachusetts, the
Opto-Radar Systems group recorded a large database of
laser radar imagery of three vehicles: an M-48 tank,
an M-113 armored personnel carrier (APC), and an
M-110 howitzer. These vehicles were recorded at five
orientations with five range backgrounds by using a
transportable ground-based laser radar sensor that was
the forerunner of the airborne IRAR system. This
ground-based sensor allowed us to create a versatile
database for testing ATR system performance. Each
of the five background scenes consisted of sky, trees,




Table 5. Classification Results for Three-Target Database of 18 Images

                                  Number of Categories   False    ART-2
                                  Formed                 Alarms   Vigilance
BCS filled silhouettes            5                      1        0.785
BCS maximum edge strengths        5                      0        0.694
BCS summed edge strengths         4                      0        0.718
Raw extracted target silhouette   3                      2        0.790

or hillside, which we created by changing the location
of the ground-based sensor relative to the target.

We selected an 18-frame image subset of the Camp
Edwards database and processed this image subset
through the ATR system. This image subset consisted
of three frames of three targets at 750 m and 1000 m
in range. The 750-m imagery had a sky background

that provided infinite range contrast between the
target and background. The 1000-m imagery had a
hillside background that had almost no range
contrast because of the high depression angle between the
sensor and the target; many pixels in this imagery
were only one range count different from the target.
The detection algorithm of the ATR system
located and extracted 100% of the targets in the test set.
This result was not unexpected for the 750-m
imagery. There was significant range contrast in the scene,
so verticality measurements alone could be used for
detection and the size filter was not required to
extract the target. In the 1000-m imagery, however, the
background was often only one range count different
from the target, which required the size filter to
extract accurately the region of interest defined as the
target. A detection rate of 100% for the 1000-m
targets demonstrated the robust behavior of the
detection stage of the ATR system.

FIGURE 15. Classification of laser radar range imagery into stable recognition categories by using the ART-2
neural network. The range silhouette is shown for three input images (a tank, an armored personnel carrier
(APC), and a howitzer), followed by the resultant image from the segmentation stage. The edge image is then
computed, followed by the result of the log-polar mapping. The right side of the figure shows the three LTM
patterns with a red box outlining the matched LTM category for the corresponding input. Note that the LTM
patterns are not identical to the input log-polar patterns, because they are an aggregate of all the inputs
classified with an individual LTM.

The ART-2 neural-network classification stage of
the ATR system properly classified 95% of the targets
into five stable recognition categories, as listed in
Table 5. Sixteen targets formed four categories
(specifically, six tanks, five APCs, three 750-m howitzers
and two 1000-m howitzers), one 1000-m howitzer
formed its own category, and one 1000-m APC was
erroneously classified as a tank and counted as a false
alarm. This performance is acceptable after careful
examination of the imagery. The tanks and APCs
formed relatively consistent invariant patterns for
classification. The detected howitzers, however, were not
classified consistently because the detection stage
either included part of the ground plane or it removed
part of the target body.

Figure 15 shows a sample classification result. The
images in the left column of the figure are the
detected silhouettes determined by using the range
imagery of a tank, APC, and howitzer from the
detection stage of the ATR system. These silhouettes are
processed by the segmentation stage, the edge strengths
are computed, and then the edge-strength images are
transformed into the log-polar domain, as shown in
the next three columns of the figure. The right side of
the figure indicates the three LTM traces created by
the ATR system after processing the nine 750-m
image frames. The red-box highlight indicates the LTM
trace with which that particular input image on the
left is matched.

Classification performance was investigated for a
set of variations to the baseline ATR system. The
baseline system uses the edge images computed from
the filled silhouettes produced by the segmentation
stage as the input to the log-polar map. The filled
silhouettes are produced by using the summed-edge
compression method. The variations investigated were


Table 6. The Effect of CNR on the Number of Categories
Formed for ATR System*

CNR (dB)   Percent     With            Without
           Anomalies   Image Cleanup   Image Cleanup
100        0.0         8               8
35         0.2         8               8
30         0.6         8               17
25         1.9         8               26
19         7.3         8
16         13.9        8
13         25.2        8
10         42.3        8
7          63.1        26

* Tests included eight vehicles (jeeps, trucks, armored personnel carriers, and tanks)
both with and without image cleanup.




the use of the maximum-edge image, the summed-edge
image, and the edge image computed from the
target silhouette produced by the detection stage. Each
of the variations is related to a reduction of processing
by either eliminating part of or the entire
segmentation stage.

We describe the results of these variations to the
baseline system in terms of the number of categories
formed and the number of false alarms (false
classifications) produced. The goal is to reduce both the
number of categories (i.e., produce better
generalization of the data) and the number of false alarms. The
results given in Table 5 indicate that both the
maximum-edge-strength image and the
summed-edge-strength image eliminate the false alarms while the
summed-edge-strength image also reduces the
number of categories. The target-silhouette image further
reduces the number of categories while sacrificing
false-alarm performance.

These preliminary results indicate that the
classification results are sensitive to the algorithms used in
the processing stages prior to the classification stage.
Additional results of tests using a larger database are
required before we can conclude that summed edge
strengths should be used exclusively as the input to
the log-polar map.

Effect  of  CNR  on  Synthetic  Broadside 
Target  Recognition 

In the first test we evaluated the effect of CNR on the
recognition of broadside targets. We performed two
individual experiments to determine the number of
categories formed without image cleanup and the
number of categories formed with image cleanup.
Table 6 shows the results of recognizing eight
broadside vehicles (two jeeps, two trucks, two tanks, and
two APCs) that are synthetically generated with a
sensor of 100-µrad angular resolution imaged at a
distance of 750 m.
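The sensor geometry quoted above fixes the size of one pixel at the target. The sketch below applies the small-angle approximation (pixel footprint equals range times angular resolution) to the stated 100-µrad resolution and 750-m range. The 3-m object extent is a hypothetical value chosen only to show the calculation, and the function names are assumptions.

```python
def pixel_footprint(range_m, angular_resolution_rad):
    """Linear size of one pixel at the target, in meters, using the
    small-angle approximation."""
    return range_m * angular_resolution_rad


def pixels_on_target(extent_m, range_m, angular_resolution_rad):
    """Number of pixels spanning an object of the given extent."""
    return extent_m / pixel_footprint(range_m, angular_resolution_rad)


footprint = pixel_footprint(750.0, 100e-6)        # about 0.075 m per pixel
height_px = pixels_on_target(3.0, 750.0, 100e-6)  # hypothetical 3-m extent
```

Under these assumptions each pixel covers about 7.5 cm at the target, so a 3-m extent spans roughly 40 pixels.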

Without image enhancement, a high ART-2
classifier vigilance value was required to separate the eight
vehicles. This high value forced the classifier to form
multiple categories for each vehicle at a CNR value of
30 dB. The same result is obtained when image
enhancement is included, in the form of the Bayesian
preprocessor, at a CNR value of 7 dB. A typical
operational sensor value of 19 dB at a distance of
1000 m indicates that image cleanup is a necessity.
For further details of the experiment and results, see
the report by S. Rak [19].

FIGURE 16. Log-polar maps of tanks and APCs rotated out of plane. The log-polar maps of the two tanks are similar
for the broadside and near-broadside views but different for the head-on view. The same similarities and
differences exist between the log-polar maps for the two APCs.

FIGURE 17. Approximate angular extent of each category for recognition of log-polar maps of eight vehicles with
out-of-plane rotation; the vehicles are (a) two jeeps, (b) two trucks, (c) two tanks, and (d) two APCs. A total of 31 categories are
formed. Each category and its angular extent is depicted by the shaded patterns in the figure. Each vehicle requires only a
single category from the broadside view up to 45° of head on or greater. The majority of the categories are in the last 15° from
near head on to head on because the log-polar maps change the most in that region.

Out-of-Plane  Rotation  Recognition 

A second test was performed to provide insight into
the number of independent categories necessary to
distinguish eight vehicles rotated out of plane from
broadside to head on (a 90° rotation) [20]. When
matched filters are used for recognition, we
commonly create filters for every 5° of arc. This test was to
provide experimental evidence for the number of
categories necessary for recognition. Again, we used the
same ATR system with the log-polar maps that we
used with the ground-based sensor data.

A  visual  depiction  of  the  information  passed  to  the 


classifier  indicates  that  input  to  the  log-polar  maps 
from  broadside  to  50°  of  head  on  are  similar,  whereas 
the  maps  near  head  on  change  radically.  Figure  16 
shows  the  log-polar  maps  for  two  tanks  and  two 
APCs  at  broadside,  50°,  and  head-on  orientations. 
Visually,  Figure  16  indicates  that  more  categories  are 
necessary  for  the  near  head-on  orientations  while  only 
a  few  categories  are  needed  for  the  near-broadside 
orientations. 

The same test was performed with the eight
vehicles rotated from broadside to head on in 1°
increments, which created 90 inputs per vehicle. The 720
aggregate inputs were then used to train the classifier,
which determined that only 31 categories were
necessary to distinguish the eight vehicles at any
orientation from broadside to head on. Figure 17 indicates
the approximate angular extent of each of the 31
categories. Some vehicles require more categories than
others. A general trend seen in this figure is that only


VOLUME 6, NUMBER 1, 1993 THE LINCOLN LABORATORY JOURNAL




•  KOLODZY 

Multidimensional  Automatic  Target  Recognition  System  Evaluation 


one  category  is  necessary  for  each  vehicle  to  distin¬ 
guish  the  vehicles  from  broadside  to  approximately 
45°  of  head  on.  This  result  agrees  with  the  intuitive 
understanding  we  have  when  viewing  the  log-polar 
maps. 
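As a rough check on what this result buys, the learned category count can be compared with a conventional matched-filter bank. The sketch below assumes one filter per vehicle for every 5° step across the 90° span (endpoints not double-counted); the variable names are illustrative and are not taken from the ATR system.

```python
# Compare a fixed 5-degree matched-filter bank with the learned categories.
# Assumption: one filter per vehicle per 5-degree step across the 90-degree span.
NUM_VEHICLES = 8
SPAN_DEG = 90
FILTER_STEP_DEG = 5
LEARNED_CATEGORIES = 31  # total reported for all eight vehicles

training_inputs = NUM_VEHICLES * SPAN_DEG            # 1-degree increments
filters_per_vehicle = SPAN_DEG // FILTER_STEP_DEG
matched_filter_bank = NUM_VEHICLES * filters_per_vehicle
mean_categories_per_vehicle = LEARNED_CATEGORIES / NUM_VEHICLES

print(training_inputs, matched_filter_bank, mean_categories_per_vehicle)
```

Under these assumptions the classifier stores 31 categories where a uniformly spaced filter bank would hold 144 filters, with the savings concentrated near broadside.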

Resolution  Requirements  and  the  Johnson  Criterion 

A  final  test  of  the  ATR  system  is  the  comparison 
between  the  criteria  indicated  by  J.  Johnson  [20]  and 
the  resolution  requirements  for  recognition  and  iden¬ 
tification.  Johnson’s  work  focused  on  determining  the 
imaging  requirements  of  a  sensor  to  produce  a  level  of 
discrimination  and  recognition  for  human  observers. 
The  work  consisted  of  psychovisual  experiments  on 
U.S.  Army  personnel  by  using  image  intensifier  imag¬ 
ery  that  is  similar  in  quality  to  passive-IR  imagery. 
The  personnel  were  shown  images  of  various  vehicles 
at  various  resolutions  and  asked  to  identify  the  ve¬ 
hicles.  The  Johnson  criterion  is  the  number  of  pixels  in 
a  vehicle’s  minimum  dimension  (usually  height)  that 
is  required  for  a  50%  probability  of  correctly  identify¬ 
ing  the  vehicle. 


[Figure 18 horizontal axis: object height in pixels (Johnson criteria ratio)]

FIGURE  18.  Johnson-criterion  test  to  indicate  the  number 
of  pixels  necessary  for  identification  of  a  target.  In  this 
case  the  targets  are  two  jeeps  and  an  APC.  The  number  of 
cumulative  categories  formed  for  a  set  of  training  patterns 
at  each  object  height  in  pixels  is  shown  for  object  heights 
from  13  to  23  pixels.  The  increase  in  the  number  of  catego¬ 
ries  from  the  baseline  case  height  of  23  pixels  indicates  the 
inability  of  the  classifier  to  generalize  the  patterns.  The 
results  shown  in  the  figure  indicate  that  the  classifier  per¬ 
forms  well  up  to  20%  above  the  Johnson  criterion  of  13 
pixels,  as  indicated  by  the  dashed  line  in  the  figure. 



We  performed  an  experiment  with  the  spatial  ex¬ 
tent  of  each  pixel  as  the  variable;  this  experiment  was 
identical  to  the  one  on  the  effect  of  CNR  described 
above.  Three  broadside  vehicles  were  used  (two  jeeps 
and an APC) for the training, and the number of
pixels in the minimum dimension was varied from
13 to 23. Figure 18 indicates the number of categories
formed as a function of the number of pixels in
the  minimum  dimension.  For  complex  vehicle  out¬ 
lines  such  as  the  jeeps,  the  classifier  performs  well  up 
to  20%  greater  than  the  Johnson  criterion  for  identi¬ 
fication.  For  a  much  simpler  vehicle  such  as  the  APC, 
the  classifier  is  more  robust  and  can  still  identify  the 
vehicle  at  the  Johnson  criterion.  For  more  details  on 
the  methodology  and  interpretation  of  results,  see  the 
report  by  Rak  [19]. 
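The resolution margin quoted above can be turned into a pixel count with a line of arithmetic. The sketch below is illustrative only: the 13-pixel criterion and the 20% margin come from the text, while the vehicle height, range, and per-pixel angular resolution in the second half are hypothetical numbers chosen to show the small-angle calculation.

```python
import math

JOHNSON_IDENTIFICATION_PIXELS = 13  # criterion quoted in the Figure 18 caption

def required_pixels(margin_fraction):
    """Pixels across the minimum dimension at a fractional margin above the criterion."""
    return JOHNSON_IDENTIFICATION_PIXELS * (1.0 + margin_fraction)

def pixels_on_target(min_dimension_m, range_m, pixel_ifov_rad):
    """Pixels subtended by a target dimension (small-angle approximation)."""
    return min_dimension_m / (range_m * pixel_ifov_rad)

jeep_pixels = required_pixels(0.20)  # about 15.6 pixels, so 16 whole pixels

# Hypothetical geometry: 1.8-m-tall vehicle, 1000-m range, 0.1-mrad pixels.
example = pixels_on_target(1.8, 1000.0, 0.1e-3)
print(math.ceil(jeep_pixels), round(example, 1), example >= jeep_pixels)
```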

The  ATR  Evaluation  Facility 

Military applications require the use of ATR systems
in both semi-autonomous and autonomous modes.
(In semi-autonomous mode, we trust the recognition
capabilities of the ATR system enough for a user to
act on its results; in autonomous mode, we let the
system act on the results on its own.) The testing and
acceptance of ATR systems for these military
applications have proven to be difficult. The resources
necessary to provide useful test results are usually
prohibitive. Either we must use large amounts of real
sensor imagery, sometimes in multiple sensor
modalities, for each given mission scenario, or we
must use synthetically generated data.
The  real  sensor  imagery  requires  expensive  and  time- 
consuming  efforts  to  gather  the  data,  while  the  syn¬ 
thetic  imagery  places  an  inherent  trust  in  the  validity 
of  the  sensor  and  target  models  used  to  generate  the 
synthetic  data.  The  recent  development,  however,  of 
inexpensive  computer  graphics  workstations  and  data- 
processing  engines  has  begun  to  change  the  emphasis 
from  measurement  missions  to  computer-generated 
data. 

We  are  currently  developing  an  ATR  evaluation 
facility  (AEF)  that  exploits  the  recent  developments 
in  computer  graphics  and  data  processing  to  provide 
an  effective  test  environment.  This  facility  merges 
high-resolution  data,  an  electronic  terrain  board  (ETB) 
that combines sensor data with synthetic targets and
sensor models, and an ATR system that is under
evaluation. The high-resolution data are taken in the
modalities of interest (laser, passive IR, and visible)
and  stored  in  databases.  The  ETB  uses  the  databases 
along  with  the  sensor  and  target  models  to  modify  the 
measured  imagery  for  ATR-system  sensitivity  analy¬ 
ses.  This  section  describes  the  facility  as  well  as  cur¬ 
rent  research  on  its  development. 


Description of the ATR Evaluation Facility

The  AEF  merges  existing  sensor  data  in  multiple 
modalities  with  synthetic  data  from  sensor  and  target 
computer  models.  Figure  19  shows  the  conceptual 
flow  of  information  in  the  AEF.  The  airborne  IRAR 
sensor  suite,  which  is  described  earlier  in  this  article, 
collects  high-resolution  imagery  in  laser  intensity, 


[Figure 19 block labels. Sensor databases: forward-looking laser radar (intensity, range) and passive IR; down-looking laser radar I (0.8-µm intensity, relative range), down-looking laser radar II (10.6-µm intensity, relative range), and passive IR. Other blocks: ATR algorithms, electronic terrain board.]

FIGURE  19.  The  ATR  evaluation  facility  incorporates  the  down-looking  and  forward-looking  sensor  imagery  databases, 
sensor  and  target  models,  the  electronic  terrain  board  (ETB),  and  the  ATR  algorithm  suite.  Image  databases  and  sensor 
and  target  models  are  fused  within  the  ETB,  which  allows  us  to  modify  target  models  and  vary  the  measured  backgrounds. 




range,  passive  IR,  and  MMW  in  a  variety  of  wavebands 
and  view  aspects.  These  data  are  stored  in  large  data¬ 
bases  that  are  used  to  refine  the  synthetic  data  created 
from  sensor  and  target  models. 

The  modeling  efforts  and  the  databases  are  merged 
in  the  ETB.  Most  false  alarms  and  missed  detections 
as  well  as  missed  classifications  of  targets  are  due  to 
the  variability  of  background  clutter  signals.  Model- 
based  systems  and  trainable  recognition  systems  are 
developed  by  using  limited  target  signatures  only; 
unfortunately,  these  systems  do  not  develop  internal 
models  for  backgrounds  as  well.  Therefore,  we  must 
find  a  way  to  merge  target  signatures,  which  are  pre¬ 
dominantly  models,  with  background  clutter. 

It  is  difficult,  however,  to  model  background  sig¬ 
nals  because  of  their  variable  and  unpredictable  na¬ 
ture.  The  background  models  are  therefore  the  weak¬ 
est  link  of  a  completely  synthetic  sensor  image.  The 
combination  of  measured  background  imagery  with 
the  more  well-defined  synthetic  target  models  cir¬ 
cumvents  this  problem. 
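One simple way to picture the merge of a synthetic target with measured background is a per-pixel nearest-surface test on range images. The sketch below is an assumption made for illustration, not the ETB implementation; `insert_target` and the array values are hypothetical.

```python
import numpy as np

def insert_target(background_range, target_range):
    """Composite a synthetic target into a measured range image.

    target_range holds np.inf wherever the target model does not cover a
    pixel. The nearer surface wins at every pixel, so measured background
    in front of the target still occludes it.
    """
    merged = np.minimum(background_range, target_range)
    target_mask = target_range < background_range
    return merged, target_mask

background = np.full((4, 6), 100.0)   # flat measured terrain at 100 m
background[:, 0] = 90.0               # a foreground tree line at 90 m
target = np.full((4, 6), np.inf)
target[1:3, 0:3] = 95.0               # synthetic target surface at 95 m
merged, mask = insert_target(background, target)
print(merged)
```

Here the target replaces the terrain behind it but stays hidden where the tree line is closer to the sensor, which is the behavior a measured background is meant to preserve.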

Terrain  Database 

The down-looking laser radar sensor described earlier
provides high-resolution range imagery. This imagery
is a 2½-D representation of the actual terrain and
is free of the speckle noise characteristic of
intensity images. The 2½-D imagery contains the
range of the first object or part of an object that is
interrogated for each pixel. The 2½-D notation indicates
that a full 3-D image is produced, although the scene
viewed from above appears as if a blanket were
covering the objects in it. Only the highest point
interrogated in each pixel is recorded; any part of an
object at a longer range or in an occluded area is not
represented in the data. For example, a ball in midair
viewed from above is represented as a hemisphere on
top of a cylinder because no information is available
on the space below the ball. Techniques for combining
multiple views are being investigated to alleviate this
limitation.
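The ball example can be reproduced with a few lines of code. The sketch below is a minimal illustration, assuming a regular ground grid and a flat ground plane at zero height; `to_height_map` is a hypothetical helper, not part of the sensor processing chain.

```python
import numpy as np

def to_height_map(points, shape, cell_size=1.0, ground=0.0):
    """Collapse 3-D points (x, y, z) into a 2.5-D map keeping the highest z per cell."""
    height = np.full(shape, ground, dtype=float)
    cols = np.floor(points[:, 0] / cell_size).astype(int)
    rows = np.floor(points[:, 1] / cell_size).astype(int)
    inside = (rows >= 0) & (rows < shape[0]) & (cols >= 0) & (cols < shape[1])
    # Unbuffered maximum: repeated cells keep only their largest z value.
    np.maximum.at(height, (rows[inside], cols[inside]), points[inside, 2])
    return height

# A ball of radius 1 centred 5 units above flat ground, sampled on its surface.
rng = np.random.default_rng(0)
directions = rng.normal(size=(20000, 3))
directions /= np.linalg.norm(directions, axis=1, keepdims=True)
ball = directions + np.array([4.0, 4.0, 5.0])
height_map = to_height_map(ball, shape=(8, 8), cell_size=1.0)
print(height_map[3:5, 3:5])
```

Every sample from the lower half of the ball is discarded in favor of the upper surface above it, so the map carries no record of the empty space between the ball and the ground.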

The down-looking sensor simultaneously measures
range as well as laser intensity and passive IR. The
existence of 2½-D range imagery makes each of the
sensor domains amenable to coordinate transformations.
As described earlier, the ability to transform the range
data and, subsequently, the pixel-registered passive-IR
data allows the sensor imagery to be used to train and
test ATR systems with many viewing aspects. The specific
method  used  for  the  coordinate  transformation  can 
have  a  dramatic  effect  both  on  the  requirements  for 
computation  and,  more  importantly,  on  the  quality 
of  the  resultant  image. 

Traditionally,  Euler  angles  have  been  used  to  repre¬ 
sent  coordinate  transformations,  and  these  coordinate 
transformations  can  be  expressed  as  3-by-3  rotation 
matrices. Because the computer graphics community
commonly uses rotation matrices, most of the specialized
hardware developed to perform coordinate
transformations employs this method. This choice has
been motivated primarily by the fact that translation and
scaling as well as rotation can be represented by one
matrix. The same transformations that can be performed
by a matrix, however, can be performed with
fewer operations by using quaternions [21].

An  important  consideration  in  the  choice  between 
matrices  and  quaternions  for  coordinate  transforma¬ 
tions  occurs  when  we  interpolate  between  two  orien¬ 
tations.  Rotation  matrices  are  not  well  defined  for 
interpolations,  because  rotations  are  carried  out  by 
three  successive  rotations  about  three  fixed  axes.  Be¬ 
cause  these  successive  rotations  are  not  commutative, 
changing  the  order  of  the  rotations  produces  different 
results,  which  introduces  a  significant  problem  known 
as gimbal lock. This problem occurs when the interaction
of two rotations aligns two of the three rotation
axes  and  causes  a  loss  in  one  degree  of  rotational 
freedom.  Quaternions  are  free  of  this  problem  be¬ 
cause  the  cross-product  interaction  between  succes¬ 
sive  rotations  is  preserved  [21].  Because  of  this  rota¬ 
tional  stability,  the  aerospace  industry  for  many  years 
has  preferred  quaternions  over  matrices  defined  by 
Euler  angles  for  spacecraft  applications. 
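The loss of a rotational degree of freedom can be seen numerically. The sketch below assumes a yaw-pitch-roll (z-y-x) Euler convention, which is one common choice and not necessarily the one used in the AEF; the function names are illustrative.

```python
import numpy as np

def rot_x(a):
    c, s = np.cos(a), np.sin(a)
    return np.array([[1, 0, 0], [0, c, -s], [0, s, c]])

def rot_y(a):
    c, s = np.cos(a), np.sin(a)
    return np.array([[c, 0, s], [0, 1, 0], [-s, 0, c]])

def rot_z(a):
    c, s = np.cos(a), np.sin(a)
    return np.array([[c, -s, 0], [s, c, 0], [0, 0, 1]])

def euler_zyx(yaw, pitch, roll):
    """Three successive rotations about fixed axes: roll (x), pitch (y), yaw (z)."""
    return rot_z(yaw) @ rot_y(pitch) @ rot_x(roll)

deg = np.radians
# With pitch at 90 degrees only the difference (roll - yaw) matters, so two
# different angle triples describe one and the same orientation.
a = euler_zyx(deg(30), deg(90), deg(50))
b = euler_zyx(deg(10), deg(90), deg(30))
locked = np.allclose(a, b)

# Away from 90 degrees of pitch the same two triples are distinct orientations.
c = euler_zyx(deg(30), deg(20), deg(50))
d = euler_zyx(deg(10), deg(20), deg(30))
free = not np.allclose(c, d)
print(locked, free)
```

At the locked pitch angle, yaw and roll turn the body about the same axis, so an interpolation that varies them independently cannot reach orientations that need the lost degree of freedom.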

Because  most  computer  graphics  workstations  have 
hardware  that  is  specifically  designed  to  implement 
matrix  transformations,  we  must  continue  to  main¬ 
tain  all  viewing  parameters  in  matrix  form.  The  AEF 
system  is  designed  to  perform  all  interpolations  by 
using  quaternions,  which  are  then  converted  to  ma¬ 
trix  form  for  rendering. 
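The conversion step from a quaternion to a rendering matrix can be sketched with the standard unit-quaternion formula. The component layout [s, x, y, z] follows the scalar-vector form q = [s, v] used in this article; the function name is an assumption, and the AEF's own conversion code is not shown in the source.

```python
import numpy as np

def quaternion_to_matrix(q):
    """3-by-3 rotation matrix for a quaternion q = [s, x, y, z] (normalized first)."""
    s, x, y, z = np.asarray(q, dtype=float) / np.linalg.norm(q)
    return np.array([
        [1 - 2 * (y * y + z * z), 2 * (x * y - s * z),     2 * (x * z + s * y)],
        [2 * (x * y + s * z),     1 - 2 * (x * x + z * z), 2 * (y * z - s * x)],
        [2 * (x * z - s * y),     2 * (y * z + s * x),     1 - 2 * (x * x + y * y)],
    ])

# 90-degree rotation about the z axis: q = [cos(45 deg), 0, 0, sin(45 deg)].
half = np.radians(90.0) / 2
R = quaternion_to_matrix([np.cos(half), 0.0, 0.0, np.sin(half)])
x_axis_rotated = R @ np.array([1.0, 0.0, 0.0])  # approximately (0, 1, 0)
print(np.round(R, 3))
```

Keeping interpolation in quaternion form and converting only at the end means the graphics hardware still receives the matrices it expects.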




To illustrate the use of quaternions for interpolating
rotations we must first define what a quaternion is
and how it is used to perform a rotation. A quaternion
consists of two components: a scalar part and a
vector part. Consider a quaternion $q = [s, \mathbf{v}]$, where $s$
is a scalar and $\mathbf{v}$ is a vector of three elements. In
quaternion algebra, addition is defined as

$$q_1 + q_2 = [(s_1 + s_2),\ (\mathbf{v}_1 + \mathbf{v}_2)],$$

and multiplication is defined as

$$q_1 q_2 = [(s_1 s_2 - \mathbf{v}_1 \cdot \mathbf{v}_2),\ (s_1 \mathbf{v}_2 + s_2 \mathbf{v}_1 + \mathbf{v}_1 \times \mathbf{v}_2)],$$

where $\mathbf{v}_1 \cdot \mathbf{v}_2$ is the vector dot product and $\mathbf{v}_1 \times \mathbf{v}_2$ is
the vector cross product.

Before we can define rotations using quaternions
we need to define the inverse operation

$$q^{-1} = \frac{[s,\ -\mathbf{v}]}{\|q\|^2},$$

where

$$\|q\|^2 = s^2 + \mathbf{v} \cdot \mathbf{v}.$$

To rotate a point $\mathbf{p}$ we embed it into a quaternion as
$[0, \mathbf{p}]$. Rotation is then defined as

$$[0, \mathbf{p}'] = \mathrm{Rot}([0, \mathbf{p}]) = q\,[0, \mathbf{p}]\,q^{-1},$$

where $q$ and $q^{-1}$ are unit quaternions.
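The quaternion operations defined above translate almost directly into code. The following is a minimal sketch that stores a quaternion as a four-element array [s, x, y, z]; the function names and the axis-angle constructor are conveniences added here, not part of the original system.

```python
import numpy as np

def q_mul(q1, q2):
    """Quaternion product [s1*s2 - v1.v2, s1*v2 + s2*v1 + v1 x v2]."""
    s1, v1 = q1[0], np.asarray(q1[1:], dtype=float)
    s2, v2 = q2[0], np.asarray(q2[1:], dtype=float)
    s = s1 * s2 - np.dot(v1, v2)
    v = s1 * v2 + s2 * v1 + np.cross(v1, v2)
    return np.concatenate(([s], v))

def q_inv(q):
    """Inverse [s, -v] / (s^2 + v.v)."""
    q = np.asarray(q, dtype=float)
    return np.concatenate(([q[0]], -q[1:])) / np.dot(q, q)

def q_from_axis_angle(axis, angle):
    """Unit quaternion for a rotation of `angle` radians about `axis`."""
    axis = np.asarray(axis, dtype=float)
    axis = axis / np.linalg.norm(axis)
    return np.concatenate(([np.cos(angle / 2)], np.sin(angle / 2) * axis))

def rotate(q, p):
    """Rotate point p by embedding it as [0, p] and forming q [0, p] q^-1."""
    embedded = np.concatenate(([0.0], p))
    return q_mul(q_mul(q, embedded), q_inv(q))[1:]

q = q_from_axis_angle([0, 0, 1], np.radians(90))
rotated = rotate(q, np.array([1.0, 0.0, 0.0]))  # approximately (0, 1, 0)
print(np.round(rotated, 6))
```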

One consideration associated with the use of quaternions
for coordinate transformations is that rotations
are performed on the unit four-dimensional hypersphere.
As a result, simple linear interpolation between two
orientations gives unequal rotations through the range
of orientation values. The unequal rotations occur
because the great arc of a unit hypersphere is the
spherical equivalent of a line, and equal linear
interpolation steps along the chord project onto
unequal portions of that arc. These unequal rotations
must be compensated for to give a smooth set of
intermediate transformations. All of the interpolations
between specified positions are performed by using unit
quaternions and spherical linear interpolations, and
then compensating for unequal rotations [21].
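The difference between straight-line and great-arc interpolation is easy to measure. The sketch below compares normalized linear interpolation with the usual spherical linear interpolation (slerp) formula; it is an illustration under those standard definitions, and the function names and the 0.9995 fallback threshold are choices made here.

```python
import numpy as np

def nlerp(q0, q1, t):
    """Linear interpolation followed by renormalization onto the unit hypersphere."""
    q = (1.0 - t) * q0 + t * q1
    return q / np.linalg.norm(q)

def slerp(q0, q1, t):
    """Spherical linear interpolation along the great arc from q0 to q1."""
    q0 = q0 / np.linalg.norm(q0)
    q1 = q1 / np.linalg.norm(q1)
    dot = np.dot(q0, q1)
    if dot < 0.0:        # q and -q are the same rotation; take the short arc
        q1, dot = -q1, -dot
    if dot > 0.9995:     # nearly identical orientations: fall back to nlerp
        return nlerp(q0, q1, t)
    theta = np.arccos(dot)
    return (np.sin((1.0 - t) * theta) * q0 + np.sin(t * theta) * q1) / np.sin(theta)

def step_angles(interpolate, q0, q1, steps):
    """Angle on the hypersphere between consecutive interpolated quaternions."""
    qs = [interpolate(q0, q1, t) for t in np.linspace(0.0, 1.0, steps + 1)]
    return np.array([np.arccos(np.clip(np.dot(a, b), -1.0, 1.0))
                     for a, b in zip(qs[:-1], qs[1:])])

identity = np.array([1.0, 0.0, 0.0, 0.0])
turn_170 = np.array([np.cos(np.radians(85)), 0.0, 0.0, np.sin(np.radians(85))])
slerp_steps = step_angles(slerp, identity, turn_170, 4)
nlerp_steps = step_angles(nlerp, identity, turn_170, 4)
print(np.degrees(slerp_steps), np.degrees(nlerp_steps))
```

For this 170° turn the slerp steps are equal, while the renormalized linear steps are smaller at the ends and larger in the middle, which is the unevenness the text describes.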

Figure  20  shows  an  example  of  the  transformation 
of  down-looking  range  data  into  various  viewing  per¬ 
spectives.  Starting  with  the  range  data  shown  in  Fig¬ 
ure  4,  the  range  data  are  transformed  and  displayed  as 
sand-colored  video  data  and  synthetically  generated 
laser  radar  range  data  in  a  viewing  sequence  typical  of 
a  target  interrogation.  In  effect,  this  series  of  transfor¬ 
mations  is  like  an  observing  eye  on  a  flying  carpet;  it 
begins  at  a  long  standoff  distance  at  a  high  altitude,  it 
detects  a  possible  target,  it  dives  to  a  lower  altitude, 
and  it  flies  along  the  road  to  the  target.  This  series  of 
transformations demonstrates how down-looking imagery
can be used to train and test an ATR system
with many viewing aspects.

Synthetic Laser Radar Imagery

An  important  element  in  the  ETB  is  the  combination 
of  synthetic  imagery  and  modified  sensor  imagery. 
Synthetic  imagery  is  derived  from  target  and  back¬ 
ground  models  applied  with  the  appropriate  sensor 
statistics.  In  some  cases,  actual  sensor  imagery  can  be 
modified to degrade the quality of the imagery for
test  purposes.  Both  of  these  cases  provide  the  addi¬ 
tional  flexibility  necessary  for  ATR  evaluation.  This 
section  describes  the  methodology  used  in  creating  or 
modifying  laser  radar  imagery. 

The  statistics  describing  a  monostatic  pulsed  rang¬ 
ing  laser  radar  employing  heterodyne  detection  are 
described  by  Hannon  and  Shapiro  [3],  and  were  used 
to  develop  a  model  for  laser  radar  range  data.  This 
laser  radar  model  requires  that  we  select  the  range  and 
CNR  value  for  every  pixel  as  well  as  the  number  of 
range  bins  Q  available  to  the  signal  processor.  The 
probabilities for the correct range value and of an
anomaly (equally distributed across Q − 1 range bins)
are given by

$$P_I^{c} = 1 - e^{-I/(\mathrm{CNR}+1)}$$

and

$$P_I^{w} = \left(1 - e^{-I}\right)^{Q-1},$$

where $P_I^{c}$ and $P_I^{w}$ are the detected intensity
probabilities from the correct range bins and the wrong




FIGURE 20. Down-looking laser radar data are transformed into a three-dimensional
terrain-map view: (a) photographic ground truth of a camouflaged truck, (b) down-looking
laser radar image, and (c) a sequence of four views that were formed by using
3-D transformed laser radar imagery to “fly over” the target.






FIGURE  21.  Scene  decomposition  and  synthesis  of  a  laser  radar  range  image.  The  original  sensor  image  (upper  center)  is 
decomposed  into  polygonal  background  (upper  left);  statistical,  chaotic,  or  fractal  background  (upper  right);  facet  target 
models  (lower  left);  and  physical  target-background  parameters  (lower  right).  A  new  laser  radar  image  is  then  synthesized 
(lower  center). 


range  bins,  respectively.  These  two  probabilities  are 
used  in  conjunction  with  a  random-number  genera¬ 
tor  to  provide  two  random  draws.  The  maximum 
value of the two random draws is selected as the
intensity I.
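The two-draw procedure can be sketched as follows. The statistics used here are an assumption made for illustration: intensities are normalized to unit mean noise power, the correct bin is exponentially distributed with mean CNR + 1, and each of the Q − 1 wrong bins is exponentially distributed with mean 1. The function name and the uniform choice of the anomalous bin are likewise hypothetical, and the article's own model should be taken from Hannon and Shapiro [3].

```python
import numpy as np

def simulate_range_image(true_bin, cnr_db, num_bins, rng):
    """Draw a peak-detected range bin and intensity for every pixel.

    Assumed statistics (normalized to unit mean noise power):
      correct bin    ~ exponential with mean CNR + 1
      each wrong bin ~ exponential with mean 1, so the largest of the
                       Q - 1 wrong bins has CDF (1 - exp(-I)) ** (Q - 1)
    """
    cnr = 10.0 ** (np.asarray(cnr_db, dtype=float) / 10.0)
    u_correct = rng.random(true_bin.shape)
    u_wrong = rng.random(true_bin.shape)
    # Inverse-CDF draws for the two competing intensities.
    i_correct = -(cnr + 1.0) * np.log1p(-u_correct)
    i_wrong = -np.log1p(-u_wrong ** (1.0 / (num_bins - 1)))
    anomaly = i_wrong > i_correct
    # An anomalous pixel reports one of the other Q - 1 bins, chosen uniformly.
    offset = rng.integers(1, num_bins, size=true_bin.shape)
    measured_bin = np.where(anomaly, (true_bin + offset) % num_bins, true_bin)
    intensity = np.maximum(i_correct, i_wrong)
    return measured_bin, intensity, anomaly

rng = np.random.default_rng(1)
truth = np.full((128, 128), 40)  # flat scene in range bin 40
bins_17db, intensity_17db, anomalies_17db = simulate_range_image(truth, 17.0, 256, rng)
bins_0db, intensity_0db, anomalies_0db = simulate_range_image(truth, 0.0, 256, rng)
print(anomalies_17db.mean(), anomalies_0db.mean())
```

Because `cnr_db` may be a scalar or a per-pixel array, the same routine covers both a uniform CNR and a CNR map derived from physical target models.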

The  application  of  the  described  statistics  can  be 
demonstrated  through  an  example.  Figure  21  depicts 
the  decomposition  of  a  sensor  laser  radar  range  image 
into  four  primary  parts.  The  background  and  target 
models  constitute  three  parts:  polygonal  background 
models  for  relatively  uniform  terrain;  statistical,  cha¬ 
otic,  or  fractal  models  for  fragmented  terrains  such  as 
foliage;  and  the  geometric  target  models  generated 
from  wireframe  or  facet  libraries  (ERIM)  or  solid 


geometry  libraries  (U.S.  Army  Ballistic  Research  Labo¬ 
ratory).  The  physical  target  models,  which  are  de¬ 
scribed  by  the  CNR  for  each  pixel,  are  used  to  com¬ 
pute  the  intensity  and  range  value  by  using  the  Hannon 
laser  radar  model  [3].  In  the  example  shown,  a  variety 
of  fractal  dimensions  were  attempted  and  a  best  visual 
fit  was  selected  for  the  foliage.  A  uniform  CNR  value 
of  17  dB  (which  is  a  typical  value  for  imaging  generic 
terrain  by  the  IRAR  sensor  at  700  m)  was  used  to 
generate  the  synthetic  range  image.  The  generated 
image  is  visually  similar  to  the  original  sensor  image. 

Summary 

A flyable, multisensor system has the ability to measure
a combination of range, Doppler, laser intensity,
and  thermal  signatures  in  both  the  forward-looking 
and down-looking aspects. Theoretical analyses and
heuristic algorithms both indicate statistical advantages
to incorporating multidimensional information in
target-detection applications. The use of multiple sensor
modalities also offers some hope of addressing the
vexing issues of ATR.

A  modular,  hybrid  ATR  system  has  been  described 
that  fuses  statistical,  model-based,  and  neural  net¬ 
work  processing  structures.  The  system  has  been  tested 
on  laser  radar  range  imagery  as  well  as  synthetic  range 
imagery  incorporating  pulsed  laser  radar  statistics. 
Results  created  by  using  the  synthetic  imagery  indi¬ 
cate  that  target  identification  can  occur  in  imagery 
with  over  50%  of  the  pixels  corrupted  by  noise.  Tests 
with  out-of-plane  rotated  vehicles  indicate  that  a  fi¬ 
nite  number  of  nonuniform  angularly  spaced  projec¬ 
tions  can  be  learned  by  the  system  to  provide  target 
identification. The current system can also provide
identification with spatial resolution as low as 20%
above the Johnson criterion.

To  continue  to  test  and  evaluate  complicated  ATR 
systems,  an  ATR  evaluation  facility  is  being  constructed 
to  provide  real,  synthetic,  and  hybrid  sensor  image 
input  to  a  selected  ATR.  This  facility  uses  the  avail¬ 
able  high-resolution  down-looking  laser  radar  range 
imagery  and  high-fidelity  target  models  to  generate 
the  various  operational  scenarios. 

Acknowledgments 

The  work  described  in  this  article  is  the  combined 
effort  of  a  large  group  of  scientists  and  technical 
support  personnel.  The  author  acknowledges  the  ef¬ 
forts  of  R.  Hull,  T.  Quist,  and  S.  Prutzer  for  the 
development  of  the  multisensor  measurement  system, 
and  the  efforts  of  D.  Biron,  E.  Van  Allen,  S.  Rak,  M. 
Menon,  and  J.  Baum  for  the  development  of  the  ATR 
system  and  the  subsequent  algorithms.  The  author 
also  thanks  A.  Gschwendtner,  P.  DiCaprio,  and  H. 
Thomas  for  their  contributions  to  the  ATR  evalua¬ 
tion  facility. 

This  work  was  sponsored  by  the  Advanced  Re¬ 
search  Projects  Agency,  the  Balanced  Technology  Ini¬ 
tiative  Program,  and  the  U.S.  Air  Force  under  an 
Electronic Systems Division contract.



REFERENCES

1. D. Biron, R. Hull, and T. Quist, private communication.

2. M.B. Mark, Multipixel, Multidimensional Laser Radar System
Performance, Ph.D. thesis, MIT Dept. of Electrical Engineering
and Computer Science (1986).

3. S.M. Hannon and J.H. Shapiro, “Laser Radar Target Detection
with a Multipixel Joint Range-Intensity Processor,” SPIE
999, 162 (1989).

4. S. Geman and D. Geman, “Stochastic Relaxation, Gibbs
Distributions, and the Bayesian Restoration of Images,” IEEE
Trans. Pattern Anal. Mach. Intell. PAMI-6, 721 (1984).

5. S. Grossberg and E. Mingolla, “Neural Dynamics of Perceptual
Grouping: Textures, Boundaries, and Emergent
Segmentations,” Perception and Psychophysics 38, 141
(1985).

6. D. Hubel and T. Wiesel, “Functional Analysis of Macaque
Monkey Visual Cortex,” Proc. Royal Soc. of London (B) 198, 1
(1977).

7. P.J. Kolodzy, M.M. Menon, and E.V. Allen, private
communication.

8. E.J. Van Allen and P.J. Kolodzy, “Application of a Boundary
Contour Neural Network to Illusions and Infrared Sensor
Imagery,” IEEE First Intl. Conf. on Neural Networks 4, San
Diego, 21-24 June 1987, p. 193.

9. E.J. Van Allen, private communication.

10. J.G. Verly, R.L. Delanoy, and D.E. Dudgeon, “Machine
Intelligence Technology for Automatic Target Recognition,”
Linc. Lab. J. 2, 277 (1989).

11. D.E. Dudgeon, P.J. Kolodzy, C. Mehanian, M.M. Menon,
S.J. Rak, E.J. Van Allen, J.G. Verly, and R.L. Delanoy, private
communication.

12. E. Schwartz, “Computational Anatomy and Functional
Architecture of Striate Cortex: A Spatial Mapping Approach to
Perceptual Coding,” Vision Research 20, 647 (1980).

13. S.J. Rak, D.C. Biron, P.J. Kolodzy, M.M. Menon, and E.J.
Van Allen, private communication.

14. G. Carpenter and S. Grossberg, “Neural Dynamics of Category
Learning and Recognition: Attention, Memory Consolidation,
and Amnesia,” in Brain Structure, Learning, and Memory,
eds. J. Davis, R. Newburgh, and E. Wegman (AAAS Symposium
Series, 1983), pp. 1-49.

15. R. Lippmann, “An Introduction to Computing with Neural
Networks,” IEEE ASSP Mag. 4, 4 (Apr. 1987).

16. M.M. Menon and P.J. Kolodzy, “A Comparative Study of
Neural Network Classifiers,” presented at the First Intl. Neural
Networks Symp. (INNS), Sept. 1988.

17. G. Carpenter and S. Grossberg, “Self-Organization of Pattern
Recognition Codes for Analog Input Patterns,” Appl. Opt. 26,
4919 (Dec. 1987).

18. J.H. Shapiro, R.W. Reinhold, and D. Park, “Performance
Analyses for Peak-Detecting Laser Radars,” SPIE 663, 38
(1986).

19. S. Rak and P. Kolodzy, “Performance of a Neural Network
Based 3-D ATR System,” Project Report A7V-6, Lincoln
Laboratory (May 1991).

20. J. Johnson, “Analysis of Image Forming Systems,” in Proc.
Image Intensifier Symp. (U.S. Army ERDL, Fort Belvoir, VA,
Oct. 1958).

21. K. Shoemake, “Quaternion Calculus for Animation,” in “Math
for SIGGRAPH,” ACM SIGGRAPH 89 Course Notes, pp.
187-203 (Boston, MA), July 31-August 4, 1989.




PAUL J. KOLODZY

is assistant leader of the Opto-Radar Systems group. He
received a B.S. degree from Purdue University, and M.S.
and Ph.D. degrees from Case Western Reserve University, all
in chemical engineering. His current areas of research include
neuromorphic systems, automatic target recognition systems,
and advanced distributed simulation technology. In
1983, he was a visiting scientist working in the area of laser
remote measurement systems at Riso National Laboratory in
Roskilde, Denmark. He was cochairman of the Simulation/
Emulation Tools & Techniques Panel of the ARPA Neural
Network National Study, and he is a member of the Sensor
Fusion program committee. Paul has been at Lincoln
Laboratory since 1986.




An  Efficient  MRF  Image- 
Restoration  Technique  Using 
Deterministic  Scale-Based 
Optimization 

Murali M. Menon

■  A  method  for  performing  piecewise  smooth  restorations  on  images  corrupted 
with  high  levels  of  noise  has  been  developed.  Based  on  a  Markov  Random  Field 
(MRF)  model,  the  method  uses  a  neural  network  sigmoid  nonlinearity  between 
pixels  in  the  image  to  produce  a  restoration  with  sharp  boundaries  while 
providing  noise  reduction.  The  model  equations  are  solved  with  the  Gradient 
Descent  Gain  Annealing  (GDGA)  method — an  efficient  deterministic  search 
algorithm  that  typically  requires  fewer  than  200  iterations  for  image  restoration 
when  implemented  as  a  digital  computer  simulation.  A  novel  feature  of  the 
GDGA  method  is  that  it  automatically  develops  an  annealing  schedule  by 
adaptively  selecting  the  scale  step  size  during  iteration.  The  algorithm  is  able  to 
restore  images  that  have  up  to  71%  of  their  pixels  corrupted  with  non-Gaussian 
sensor  noise.  Results  from  simulations  indicate  that  the  MRF-based  restoration 
remains  useful  at  signal-to-noise  ratios  5  to  6  dB  lower  than  with  the  more 
commonly  used  median-filtering  technique.  These  results  are  among  the  first 
such  quantitative  results  in  the  literature. 


An image-restoration method that reduces
noise while preserving naturally occurring
boundaries in a scene is presented. The
method is useful as a preprocessor to enhance the
performance of automatic target-recognition systems.

Target recognition is a process that can involve
many stages, including measurement, preprocessing,
detection, segmentation, feature extraction, and
classification. For adequate recognition performance in a
noisy environment, it is often important that the
preprocessing stage be capable of restoring measured
images. (We justify this statement in the section
“Simulation Results.”) The restoration should reduce the
variability in the scene that results from measurement
noise and clutter while preserving important features
that make targets separable in the classification stage.
Both objectives can be accomplished by using prior
statistical knowledge of the measurement process and
the clutter in the scene, or by using an empirical
formulation of the desired restoration. (Details of
using either a statistical or empirical formulation are
contained in the following section.)

Using  the  latter  approach,  the  work  described  in 
this  article  is  based  on  an  empirical  image-restoration 
model  that  requires  nearest  neighbor  pixels  to  have 
similar  values  (smoothing),  without  losing  fidelity 
to  the  original  measurement.  The  pixel  interaction 
of  the  model  smooths  small  pixel  differences, 
but allows large differences to remain as a discontinuity
(edge). If detailed statistical information concerning
the measurement and scene is available, the information
can be quantitatively incorporated into the
image-restoration model.

This  article  describes  an  image-restoration  model 
that  is  based  on  a  neural  network  formulation  using 
Markov Random Fields (MRF), as described in the
box, “Markov Random Fields.” In the model, a neural
network sigmoid function provides pairwise pixel
interaction potentials. The function behaves quadratically
for small differences but saturates for large differences.
The MRF property of the model allows an
image  to  be,  in  effect,  decoupled  into  a  large  number 
of  connected  local  neighborhoods,  each  of  which  can 
be  processed  independently.  The  local-neighbor  in¬ 
formation  is  propagated  during  iteration  such  that  a 
global  image  restoration  is  effected  when  the  system 
reaches  a  steady  state.  The  restored  image  can  be 
found  by  solving  an  optimization  problem  that  de¬ 
pends  on  the  pixel  interaction  potentials.  The  MRF 
property  that  allows  each  pixel  update  to  depend  only 
on  a  local  neighborhood  of  pixels  eases  the  computa¬ 
tional  burden.  For  the  case  of  a  Gaussian  pixel  inter¬ 
action,  the  potential  function  is  quadratic,  leading  to 
a  simple  optimization  problem  that  involves  the  solu¬ 
tion  of  a  large  set  of  linear  equations.  For  sigmoid 
interaction  potentials  (the  present  work),  a  difficult 
high-dimensional  nonlinear  optimization  problem  re¬ 
sults.  Stochastic  methods  are  commonly  used  to  solve 
such  problems,  but  such  methods  are  often  very  slow 
and  sensitive  to  the  choice  of  annealing  schedule.  We 
propose  the  novel  deterministic  Gradient  Descent 
Gain  Annealing  (GDGA)  method  for  solving  high¬ 
dimensional  nonlinear  optimization  problems.  This 
method  is  fast  and  automatically  chooses  an  annealing 
schedule.  GDGA  is  used  to  solve  optimization  prob¬ 
lems  resulting  from  the  neural-network-based  MRF 
image-restoration  model.  Previous  deterministic  an¬ 
nealing  work,  such  as  mean  field  annealing  [1,  2], 
does  not  incorporate  an  automatic  annealing  sched¬ 
ule. 
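The effect of a saturating interaction potential can be illustrated with a small experiment. The sketch below is not the GDGA method and does not use the article's potential, neither of which is specified in this passage: it assumes a hypothetical potential V(d) = tanh²(gain · d), which is quadratic for small differences and saturates for large ones, and it minimizes the resulting energy by plain gradient descent at a fixed gain. All parameter values are illustrative.

```python
import numpy as np

def potential_gradient(d, gain):
    """Derivative of the assumed potential V(d) = tanh(gain * d) ** 2."""
    t = np.tanh(gain * d)
    return 2.0 * gain * t * (1.0 - t * t)

def restore(measured, gain=4.0, weight=0.25, step=0.01, iterations=300):
    """Fixed-gain gradient descent on
    E(x) = sum_i (x_i - m_i)^2 + weight * sum_<i,j> V(x_i - x_j),
    where <i,j> runs over four-nearest-neighbor pairs."""
    x = measured.astype(float).copy()
    for _ in range(iterations):
        padded = np.pad(x, 1, mode="edge")  # border pixels see a zero difference
        grad = 2.0 * (x - measured)
        for neighbor in (padded[:-2, 1:-1], padded[2:, 1:-1],
                         padded[1:-1, :-2], padded[1:-1, 2:]):
            grad += weight * potential_gradient(x - neighbor, gain)
        x -= step * grad
    return x

rng = np.random.default_rng(0)
clean = np.zeros((16, 16))
clean[:, 8:] = 1.0  # a single vertical edge
noisy = clean + rng.normal(scale=0.05, size=clean.shape)
restored = restore(noisy)
print(noisy[:, :7].std(), restored[:, :7].std())
```

Each pixel update touches only its four nearest neighbors, which is the locality that makes the model suitable for parallel hardware; small noise-level differences are smoothed while the unit step across the edge falls in the saturated region and is left nearly untouched.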

The  utility  of  the  MRF  model  in  restoring  images 
corrupted  with  varying  levels  of  non-Gaussian  mea¬ 
surement  noise  has  been  investigated.  Model  perfor¬ 
mance  has  been  evaluated  quantitatively  in  terms  of 
target  detection  and  recognition,  and  the  performance 
has  been  compared  to  that  of  the  commonly  used 
median-filtering  technique.  The  quantitative  results 
reported in this article are among the first such results
in  the  literature.  Because  of  prohibitive  computa¬ 
tional  requirements,  few  quantitative  characterizations 
of  image  restoration  algorithms  have  been  performed. 
Most  work  in  the  literature  has  compared  the  restored 
imagery  qualitatively,  rather  than  determining  the  ef¬ 
fect  of  the  restoration  stage  on  the  overall  system 
performance. 

The  same  model  can  be  applied  to  a  large  number 
of  sensor  measurements  (Doppler,  intensity,  passive 
infrared,  range,  and  video)  by  the  adjustment  of  a 
single parameter. This feature is especially relevant for
hardware  implementation  because  it  allows  a  single 
chip  to  be  used  for  processing  a  wide  variety  of  imag¬ 
ery.  The  model  has  a  massively  parallel  architecture 
with  local  neighbor  pixel  interactions  (four  nearest 
neighbors)  and  can  be  implemented  on  a  parallel- 
processing  computer  or  a  custom  analog  VLSI  chip. 
Implementation  of  the  model  in  analog  VLSI  would 
allow  video-rate  restoration  of  512  x  512  pixel 
images. 


Background 

In this section the Bayesian formulation of image restoration is reviewed to show the formal connections to the restoration method that is the subject of this article. The Bayesian formulation relates the posterior probability that an estimate of the true image $x^r$ is obtained given a measured image $x^m$ and the prior probabilities:

$$P(x^r \mid x^m) = \frac{P(x^m \mid x^r)\,P(x^r)}{P(x^m)}. \tag{1}$$

The term $P(x^m \mid x^r)$ incorporates prior knowledge of the measurement process, and $P(x^r)$ incorporates prior knowledge of the scene. The present model finds an estimate $x^r$ that approximately maximizes $P(x^r \mid x^m)$ given the measurement $x^m$ and prior knowledge of the scene in the form of $P(x^r)$. In the present model each pixel depends only on its four surrounding neighbors and the measured pixel as shown in Figure 1.

The probabilistic (Bayesian) formulation is equivalent to a physical system description in terms of an energy [3]:

$$E \propto -\log P(x). \tag{2}$$


• MENON

An Efficient MRF Image-Restoration Technique Using Deterministic Scale-Based Optimization


MARKOV  RANDOM  FIELDS 


A SERIES OF EVENTS in time form a Markov Chain if the probability of the outcome of an event at time t + 1 depends only on the outcome of the event at time t. This concept can also be applied to processes on a lattice. A Markov Random Field (MRF) defined on a lattice implies that the update of a pixel at site ij depends only on the values of pixels in a local neighborhood of sites $N_{ij}$ (Figure A). In terms of conditional probabilities,

$$P(X_{ij} = x_{ij} \mid X_{lk} = x_{lk},\; lk \in \text{lattice},\; lk \neq ij) = P(X_{ij} = x_{ij} \mid X_{lk} = x_{lk},\; lk \in N_{ij}),$$

where $X_{ij}$ is the real-value distribution of a random variable associated with lattice site ij and $x_{ij}$ is the specific value of the variable at that site. Thus the definition of an MRF on a lattice transforms a global problem into a more computationally tractable local problem.

It is also true that an MRF on a lattice has the following energy-based formulation:

$$P(x) = \frac{e^{-U(x)}}{Z},$$

where U is the global potential function for the entire lattice and Z is the partition function, which normalizes the probability P(x) to a range from 0 to 1.

In the present work the energy is defined over all independent pairs of sites p on the lattice:

$$U(x) = \sum_p E_p(x).$$

The specific energy of interaction for a pair of sites is given by a sigmoid function:


where the gain (β < 0) defines the scale of the sigmoid, as shown in Figure B. A small magnitude of the gain produces a large-scale (broad) sigmoid, while a large magnitude of the gain produces a small-scale (narrow) sigmoid. For either case, the sigmoid function has the property that the response saturates after the input exceeds a certain level. For a high magnitude of the gain, note that the sigmoid saturates very quickly, even for small inputs. The gain in the sigmoid function is inversely proportional to the temperature of an energy-based formulation.


FIGURE A. Markov Random Field (MRF) defined on a lattice. In the figure, the update of a pixel at site ij depends only on the values of pixels at sites lk in a local neighborhood of sites $N_{ij}$.

FIGURE B. Sigmoid function $E_p(x)$ for different magnitudes of the gain β. A small magnitude of the gain produces a large-scale (broad) sigmoid, while a large magnitude of the gain produces a small-scale (narrow) sigmoid.


Thus an MRF image processor may be specified by defining the energy function rather than the probabilities. This empirical approach is used in the present work.

Model  Description 

Equation 2 indicates that a minimization of the energy will result in a maximization of the probability $P(x^r \mid x^m)$. The total system energy can be expressed as the sum of a field term (which is due to the measured image) and a surround term (which is due to the neighbor interactions):

$$E = \lambda E^f + E^s. \tag{3}$$

The field coupling λ in Equation 3 is an adjustable parameter that determines the importance of the measurement term relative to the surround term: a small value of λ produces a highly smoothed image with little contribution from the measured image, whereas a large value essentially reproduces the measured image. The sigmoid function is used in both terms. The field term is given by

$$E^f = \sum_{ij} \frac{1}{1 + e^{\beta^f (\Delta^m_{ij})^2}}, \tag{4}$$

where $\Delta^m_{ij}$ is the difference between the restored and measured pixels (i.e., $\Delta^m_{ij} = x^r_{ij} - x^m_{ij}$), and $\beta^f$ is the saturation gain term. The sigmoid function is also used for the surround term:

$$E^s = \sum_{p} \frac{1}{1 + e^{\beta^s (\Delta^s_{p})^2}}, \tag{5}$$

where $\Delta^s_p$ is the surround pair difference (i.e., $\Delta^s_p = x^r_{p1} - x^r_{p2}$, where p refers to all independent nearest-neighbor pixel pairs in the image, and p1 and p2 refer to the members of a pair). Note that for an M × N lattice there are (N − 1)M horizontal pairs and (M − 1)N vertical pairs for a total of 2MN − M − N independent pairs.
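As a minimal sketch of how the energy in Equations 3 through 5 could be evaluated, the following NumPy fragment computes the field and surround terms and counts the independent pairs. The function names (`sigmoid_energy`, `total_energy`) and the argument names are hypothetical, and the code simply assumes the sigmoid form 1 / (1 + exp(β Δ²)) with a negative gain.

```python
import numpy as np

def sigmoid_energy(delta, beta):
    """Pairwise energy 1 / (1 + exp(beta * delta**2)), assuming a gain beta < 0."""
    return 1.0 / (1.0 + np.exp(beta * delta ** 2))

def total_energy(x_r, x_m, beta_f, beta_s, lam):
    """Return (lam * E_f + E_s, number of pairs) for restored image x_r and measurement x_m."""
    e_field = sigmoid_energy(x_r - x_m, beta_f).sum()
    d_h = x_r[:, 1:] - x_r[:, :-1]   # horizontal nearest-neighbor differences
    d_v = x_r[1:, :] - x_r[:-1, :]   # vertical nearest-neighbor differences
    e_surround = sigmoid_energy(d_h, beta_s).sum() + sigmoid_energy(d_v, beta_s).sum()
    return lam * e_field + e_surround, d_h.size + d_v.size
```

For a zero difference each term contributes 0.5, and for a very large difference each term saturates at 1, which is the bounded penalty that lets sharp boundaries survive.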

The estimate of the original image $x^r$ that minimizes the system energy is obtained with a deterministic search procedure. The present work uses the GDGA deterministic search (described in the subsection “Deterministic Solution”) to decrease the saturation gain term β in Equations 4 and 5 from −0.001 to −10.0. (Note that the gain is negative in the present formulation.)

FIGURE 1. Nearest-neighbor architecture used in the Markov Random Field (MRF) image restoration. (Diagram label: surround interaction.)

THE LINCOLN LABORATORY JOURNAL VOLUME 6, NUMBER 1, 1993

The saturating aspect of the sigmoid function (from the neural network literature [4]) in Equation 5 allows the formation of sharp boundaries between dissimilar regions. The main advantage of using a sigmoid surround term is that sharp segmentations can be obtained without a separate “line process” [3], which would require solving 2MN − M − N extra equations. Hence the sigmoid term clearly reduces the computational load. For the same reason, a sigmoid function is used for the field, or measurement, term. The sigmoid function solves the problem of providing smoothing (noise reduction) while preserving naturally occurring boundary information in the scene.

Optimization  Methods 

Stochastic  Solution 

To solve the nonlinear optimization problem suggested by Equation 3, researchers have often attempted stochastic methods, which do not require the derivative of the energy with respect to the restored state and, as a consequence, can be used for a wide range of optimization problems. Stochastic methods are also well suited for high-dimensional problems that are characterized by many acceptable solutions (restored states) all having approximately the same energy. Image restoration requires a solution with low energy, but does not need the global minimum.


The present work derives a stochastic solution by relating the statistical description of the problem to an energy-based representation. By making a correspondence to a physical system at thermal equilibrium, we can express the formulation in Equation 1 in terms of minimizing the energy of a system. The probability that a physical system in equilibrium with a heat bath at temperature T is in state i with energy $E_i$ is given by the Boltzmann distribution:

$$P_T(E_i) = \frac{e^{-E_i/(k_B T)}}{Z(T)},$$

where $k_B$ is the Boltzmann constant and Z(T) is the partition function, which is simply the sum of the exponential term $e^{-E_i/(k_B T)}$ over all possible states i.
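The effect of temperature on the Boltzmann distribution can be illustrated with a short sketch. The function name `boltzmann` is hypothetical, and the Boltzmann constant is set to 1 purely as a choice of units for the illustration.

```python
import numpy as np

def boltzmann(energies, temperature, k_b=1.0):
    """Probabilities exp(-E_i / (k_b * T)) / Z(T) for a finite list of state energies."""
    e = np.asarray(energies, dtype=float)
    # Subtracting the minimum energy leaves the probabilities unchanged
    # but keeps the exponentials from underflowing all at once.
    w = np.exp(-(e - e.min()) / (k_b * temperature))
    return w / w.sum()
```

At a high temperature the states are nearly equally likely, while at a low temperature almost all of the probability sits on the lowest-energy state, which is the behavior that annealing exploits.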

It  is  assumed  that  the  solutions  to  the  optimization 
problem  are  equivalent  to  the  states  of  a  physical 
system  and  the  cost  of  a  solution  corresponds  to  the 
energy  of  a  state.  Asymptotic  convergence  to  a  set  of 
globally  optimal  solutions  can  be  obtained  provided 
that  the  different  states  are  generated  properly  and  the 
appropriate  conditions  are  used  to  decide  whether  a 
given  state  should  be  accepted  [5].  Stochastic  meth¬ 
ods  for  solving  the  optimization  problem  involve  start¬ 
ing  at  a  high  temperature  and  annealing  (i.e.,  re¬ 
ducing)  the  temperature  until  the  system  “freezes”  to 
the  minimum  energy  state.  Ideally,  the  proce¬ 
dure  would  be  implemented  reversibly  such  that  the 
system  is  always  at  thermal  equilibrium  and  a  true 
global  minimum  is  reached  rather  than  a  metastable 
state. 

Stochastic methods for solving nonlinear optimization problems typically use a simulated annealing method [6] combined with a Monte Carlo technique such as the Metropolis algorithm [7] or the Gibbs sampler [8]. For a comprehensive study that investigates the application of simulated annealing to image
reconstruction,  see  Reference  3  by  S.  Geman  and 
D.  Geman.  The  problem  with  such  stochastic  solu¬ 
tion  techniques  is  that  a  good  annealing  schedule  is 
difficult  to  determine,  and  the  solution  time  can  be 
prohibitive  in  terms  of  the  number  of  iterations  re¬ 
quired  because  an  equilibrium  must  be  reached  at 
each  stage  of  annealing.  At  high  temperatures  a  large 
temperature  step  is  possible  because  the  search  covers 
a  wide  range  of  the  state  space.  As  the  temperature  is 
lowered,  however,  the  system  often  reaches  a  critical 
point  below  which  the  state  is  “frozen,”  analogous 
to  the  phase  diagram  of  real  physical  systems.  If 
the  critical  point  on  the  energy-versus-temperature 
curve  were  known,  then  large  steps  could  be  taken 
before  the  critical  point  were  reached  and  small  steps 
afterwards.  Unfortunately,  the  “phase  diagram”  de¬ 
pends  on  the  initial  measurement,  or  field,  term. 

In  practice  a  conservative  annealing  schedule  is 
often  used: 


where  k  is  the  iteration  number  and  T  is  the  temp¬ 
erature.  Such  a  schedule  can  require  hundreds  of  thou¬ 
sands  of  iterations  or  more  to  produce  an  acceptable 
restoration.  Automated  Local  Annealing  (ALA)  has 
been  suggested  to  provide  an  automatic  annealing 
schedule  for  neural  networks  [9],  but  the  procedure 
is  not  directly  applicable  to  an  image-restoration 
formulation. 
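The schedule equation itself is not reproduced in the text above. As an assumption made only for illustration, the sketch below uses the logarithmic schedule T_k = c / log(1 + k) that is commonly associated with the simulated-annealing analysis of Geman and Geman (Reference 3); the constant `c` and both function names are hypothetical. It shows why such a schedule is slow: each further halving of the temperature roughly squares the iteration count.

```python
import math

def log_schedule(k, c=1.0):
    """Temperature at iteration k >= 1 under the assumed schedule T_k = c / log(1 + k)."""
    return c / math.log(1.0 + k)

def iterations_to_cool(factor):
    """Smallest k with T_k <= T_1 / factor under log_schedule.

    T_k <= T_1 / factor  <=>  log(1 + k) >= factor * log(2)  <=>  k >= 2**factor - 1.
    """
    return math.ceil(2.0 ** factor - 1.0)
```

Under this assumed schedule, lowering the temperature to one-tenth of its initial value takes about a thousand iterations, and lowering it to one-twentieth takes about a million.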

Another problem is that the computational expense
of the stochastic method also depends on the
number of allowable states per image pixel. An image
with  8-bit  pixels  requires  many  more  iterations  for 
the  full  exploration  of  the  state  space  as  compared  to, 
for  example,  a  4-bit  image.  Indeed,  the  solution  of 
such  nonlinear  optimization  problems  remains  a  chal¬ 
lenging  research  area. 

Deterministic  Solution 

The large number of iterations that the stochastic approach requires in practice has motivated the use of a deterministic solution technique to solve the nonlinear image-restoration problem. The deterministic approach attempts to minimize the system energy by iteratively updating pixel values across the lattice until a steady state is reached. In the approach, the use of high gain values for the iterative solution to Equation 3 produces a restored image with sharp boundaries. (Note: In the analogy of the physical system discussed earlier, a high gain value corresponds to a low temperature, or a small scale in that a small change in the input to the sigmoid function will produce a large change in the output.) The use of high gain values, however, will most likely lead to the procedure’s being trapped in a local minimum. To remedy this problem we have developed the GDGA technique, which starts the solution procedure at a low gain (i.e., a high temperature, or large scale). The intermediate solution at low gain is then used as an initial condition to the problem at a higher gain, and the procedure is repeated until the final desired gain values are achieved. Solving a series of problems each at higher gain values is equivalent to temperature annealing in the stochastic approach. In addition, we have developed an automatic annealing (gain increase, or scale decrease) schedule that is described below.

An equation of motion based on the total energy from Equation 3 is defined by

$$\frac{\partial x^r}{\partial t} = -\nabla_{x^r} E, \tag{6}$$

where t represents a pseudo-time quantity. If Equations 4 and 5 are substituted for the total energy term in Equation 6, then the equation of motion for a single pixel at a lattice site ij is

$$\frac{\partial x^r_{ij}}{\partial t} = -\lambda \frac{\partial}{\partial x^r_{ij}} \sum_{ij} \frac{1}{1 + e^{\beta^f (\Delta^m_{ij})^2}} - \frac{\partial}{\partial x^r_{ij}} \sum_{p} \frac{1}{1 + e^{\beta^s (\Delta^s_{p})^2}},$$

where $\sum_{ij}$ refers to all of the lattice sites in the image and $\sum_{p}$ refers to all of the independent pixel pairs in the lattice. In the present work the MRF is given by the two horizontal and two vertical pairs associated with a given lattice site ij, resulting in a neighborhood $N_{ij}$ given by

$$N_{ij} = \{x_{i+1,j},\; x_{i,j+1},\; x_{i-1,j},\; x_{i,j-1}\}.$$

Hence, with this local neighborhood the update of a pixel at lattice site ij depends only on the pixel’s four nearest neighbors.

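The objective is to find the steady-state solution to Equation 6 that results in a state $x^r$ that minimizes the system energy. The particular form of Equation 6, the equation of motion, guarantees that the steady-state solution minimizes the energy. This relationship can be shown by using the identity

$$\frac{dE}{dt} = \frac{\partial E}{\partial x^r}\,\frac{dx^r}{dt} \tag{7}$$

and substituting for $dx^r/dt$ from Equation 6 into Equation 7, resulting in

$$\frac{dE}{dt} = -\left(\frac{\partial E}{\partial x^r}\right)^2.$$

Because the right-hand side is never positive, the energy cannot increase with pseudo-time, and it stops changing only where the gradient vanishes.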

In the present work the GDGA deterministic technique is used to minimize the energy. This formulation is similar to the Graduated Non Convexity (GNC) approach of A. Blake and A. Zisserman [10] and the technique used by Y.G. LeClerc [11], and has some similarity to mean field annealing [1]. We have found GDGA to be substantially faster than the stochastic techniques described in the literature. The GDGA technique iteratively solves Equation 6 by calculating the gradient of the energy and updating the state (similar to an Euler solution of a system of coupled differential equations). In this approach the magnitudes of the gain terms $\beta^f$ and $\beta^s$ in Equations 4 and 5 are increased from a value starting at 0.001. At small gain magnitudes the restoration acts to smooth the image because the energy terms are approximately locally quadratic with the pixel difference. (An energy term that is quadratic generates a larger penalty for larger pixel differences. Hence a smooth image, i.e., an image with equal pixel values, minimizes this energy.) Also, at small gain magnitudes all edges in the image are smoothed, resulting in a blurred image. As the magnitude of the gain is increased the naturally occurring boundaries in the measured image start to appear, and eventually a sharp segmentation results.

The steady-state solution of Equation 6 at a given gain value is found by setting the gradient of the energy to zero and iteratively solving for the new pixel value. The deterministic technique is implemented with a fixed-point iteration around each pixel, in which the pixels are updated with a Jacobi (fully parallel) scheme [12]. In the technique, the gradient of the energy is set to zero, and the term $x^r_{ij}$ is updated based on the old values of its neighbors:

$$x^{r(\text{new})}_{ij} = \frac{\lambda\, g^f_{ij}\, x^m_{ij} + A}{\lambda\, g^f_{ij} + B},$$

where

$$A = g^s_{i,j-1}\, x^{r(\text{old})}_{i,j-1} + g^s_{i,j+1}\, x^{r(\text{old})}_{i,j+1} + g^s_{i-1,j}\, x^{r(\text{old})}_{i-1,j} + g^s_{i+1,j}\, x^{r(\text{old})}_{i+1,j}, \quad \text{and}$$

$$B = g^s_{i,j-1} + g^s_{i,j+1} + g^s_{i-1,j} + g^s_{i+1,j}. \tag{8}$$

In Equation 8 the nonlinear term $g_{ij}$ is given by

$$g_{ij} = \frac{\beta\, e^{\beta (\Delta_{ij})^2}}{\left[1 + e^{\beta (\Delta_{ij})^2}\right]^2}, \tag{9}$$

where $\Delta_{ij}$ is calculated based on the old pixel values. At this point the lattice could be updated, but a gain annealing schedule has not yet been specified. The GDGA technique automatically selects an annealing schedule by using feedback from the total system energy to select the gain step size. The strategy involves varying the step in β in order to maintain constant steps in energy. The β step is given by

$$\Delta\beta = \frac{\Delta E}{\partial E / \partial \beta}. \tag{10}$$

In Equation 10 a constant energy step ΔE is used, and the derivative is given by

$$\frac{\partial E}{\partial \beta} = -\sum \frac{(\Delta)^2\, e^{\beta (\Delta)^2}}{\left[1 + e^{\beta (\Delta)^2}\right]^2}. \tag{11}$$

Note that in Equations 10 and 11 the gain term β refers to both the field and surround terms.

The GDGA technique starts at a small magnitude of gain and repeatedly applies Equation 8 until convergence, which typically requires fewer than 10 updates. Then Equation 10 is used to update the gain terms $\beta^f$ and $\beta^s$ and, with the new values, the pixels are again updated. The procedure is terminated when the magnitude of the gain becomes large, typically a value of 10. For both simulated and real images, the GDGA algorithm is able to complete the restoration process (i.e., achieve sharp segmentation with noise removal) by using a total of 100 to 200 applications of the update equation (each application, or iteration, of Equation 8 updates all the pixels in the lattice). The restoration of a 128 × 128 pixel, 8-bit image requires less than 5 min on a SUN-4 workstation.
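The procedure just described can be sketched in a few lines of NumPy. This is an illustrative reading of Equations 8 through 11, not the author's implementation, and it rests on several assumptions that are not stated in the article: pixel values are scaled to the range 0 to 1, a single gain is shared by the field and surround terms, border pixels simply have fewer neighbors, the default energy step `delta_e` is 2% of the pixel count, and each gain step is capped so that the gain magnitude at most doubles. All function and parameter names are hypothetical.

```python
import numpy as np

def sig_terms(delta, beta):
    """Per-difference weight g (Eq. 9) and d/d(beta) of 1 / (1 + exp(beta * delta**2))."""
    e = np.exp(beta * delta ** 2)        # beta < 0, so 0 <= e <= 1 and exp cannot overflow
    s = e / (1.0 + e) ** 2
    return beta * s, -(delta ** 2) * s

def jacobi_update(x, x_m, beta, lam):
    """One fully parallel pass of the fixed-point update (Eq. 8)."""
    g_f, _ = sig_terms(x - x_m, beta)
    num = lam * g_f * x_m
    den = lam * g_f
    for axis in (0, 1):
        for shift in (1, -1):
            nb = np.roll(x, shift, axis=axis)        # neighbor values (wraps at the border)
            g_s, _ = sig_terms(x - nb, beta)
            edge = [slice(None), slice(None)]
            edge[axis] = 0 if shift == 1 else -1
            g_s[tuple(edge)] = 0.0                   # discard the wrapped-around pairs
            num = num + g_s * nb
            den = den + g_s
    ok = np.abs(den) > 1e-12                         # safeguard (assumption): keep old value if all weights vanish
    return np.where(ok, num / np.where(ok, den, 1.0), x)

def gain_slope(x, x_m, beta, lam):
    """dE/d(beta) of lam * E_f + E_s at the current state (Eq. 11 summed over all terms)."""
    slope = lam * sig_terms(x - x_m, beta)[1].sum()
    slope += sig_terms(x[:, 1:] - x[:, :-1], beta)[1].sum()
    slope += sig_terms(x[1:, :] - x[:-1, :], beta)[1].sum()
    return slope

def gdga_restore(x_m, lam=1.0, beta_start=-0.001, beta_end=-10.0,
                 delta_e=None, max_updates=10, tol=1e-4, max_gain_steps=2000):
    """Relax at each gain, then step the gain by delta_e / (dE/d beta) (Eq. 10)."""
    x_m = np.asarray(x_m, dtype=float)
    x = x_m.copy()
    if delta_e is None:
        delta_e = 0.02 * x.size                      # assumed constant energy step
    beta = beta_start
    for _ in range(max_gain_steps):
        for _ in range(max_updates):
            x_new = jacobi_update(x, x_m, beta, lam)
            change = np.max(np.abs(x_new - x))
            x = x_new
            if change < tol:
                break
        if beta <= beta_end:
            break
        slope = gain_slope(x, x_m, beta, lam)        # never positive
        step = delta_e / slope if slope < -1e-12 else -np.inf
        beta = max(beta + max(step, beta), beta_end) # cap (assumption): |beta| at most doubles
    return x, beta
```

A call such as `restored, final_gain = gdga_restore(noisy / 255.0)` would apply the sketch to an 8-bit image held in a hypothetical array `noisy`. Because every update is a weighted average of the measured pixel and its neighbors, the restored values stay within the range of the measured values.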

Further experiments have revealed that the automatic gain annealing process takes small steps in β at small magnitudes of gain and toward the end of the annealing takes large steps as the magnitude of β becomes larger. Thus the GDGA method adaptively adjusts the step size to make efficient use of each iteration. The adaptive nature of the algorithm is especially evident when comparing restorations of low and high noise imagery. Images with about 10% of the pixels corrupted with noise require fewer than 20 iterations for the entire restoration, while images with 71% noise require about 200 to 250 iterations.

FIGURE 2. Original (noise free) image containing a linearly sloping background with a target at constant range.

FIGURE 3. The image in Figure 2 has been synthesized for a carrier-to-noise ratio (CNR) of 10 dB, corresponding to an image in which 20% of the pixel values are anomalous. (An anomaly is defined as a pixel value in the corrupted image that differs from the value in the original image by more than two range counts.) Each pixel in the image has 8 bits (256 gray levels) of resolution.

Simulation Results

This section presents the qualitative and quantitative results obtained from applying the GDGA algorithm to the MRF image-restoration model described earlier. The algorithm was tested on a synthetic range image that had been corrupted with noise by a range-sensor measurement model described in the literature [13]. The measurement model, which simulates a peak-detecting laser radar sensor that introduces anomalies into the measurement, was used to corrupt an image by relating the carrier-to-noise ratio (CNR) of the range sensor to the expected percent anomalies in the measurement. (An anomaly is defined as a pixel value in the corrupted image that differs from the value in the original image by more than two range counts.) In this work CNR values of 10 dB and 6 dB were used, corresponding to 20% and 71% anomalies, respectively, in the corrupted image. The measurement model does not assume a Gaussian distribution and is based on realistic sensor measurements.

The original (noise free) synthetic image shown in Figure 2 contains a simple shape at a constant pixel value against a background whose pixel values linearly increase from the top to the bottom of the image. In this work all of the input and restored images have 8 bits (256 gray levels) of resolution. Figure 3 shows the range image of Figure 2 after the image has been corrupted with 20% anomalies, and Figure 4 shows the result of the MRF restoration. Except for a few discrepancies at the boundary, the restoration is nearly perfect, especially in restoring the sloping background. Figure 5 shows the range image of Figure 2 with 71% anomalies, and Figure 6 shows the result of MRF restoration. Although the human visual system can barely recognize the original shape at this noise level, the MRF restoration is able to recover the underlying edges that define the target’s shape, as shown in Figure 6. In both cases the MRF model produces a piecewise smooth restoration of the input image. The use of the sigmoid function facilitates noise reduction (smoothing) while preserving sharp discontinuities (edges).

FIGURE 4. Results of MRF restoration on the synthetic range image of Figure 3. A four-nearest-neighbor MRF processor was used to restore the image. Note that the edges have been nearly perfectly preserved.

FIGURE 5. The image of Figure 2 has been synthesized for a CNR of 6 dB, corresponding to an image that has 71% anomalies. Each pixel in the image has 8 bits (256 gray levels) of resolution.

Next we describe a quantitative analysis of the model’s performance based on the average percent of anomalies in the restored image. This measure is relevant to target detection. Figure 7 shows a statistical comparison between corrupted images with and without restoration. In the figure, each point represents an average over 10 runs at a fixed noise level. The average percent anomalies in the image gives an indication of the difficulty of target detection; i.e., the probability of detection is lower at higher anomaly percentages. For effective detection, the percent of anomalies must be less than about 10%. Thus the results show that MRF restoration is effective up to a CNR of about 6 dB. At this CNR the average percent of anomalies is 71% for the input image, 55% for the median-filtered image, and only 4.5% for the MRF-restored image. Clearly the MRF restoration provides superior detection performance in a high-noise environment.

FIGURE 6. Results of MRF restoration on the synthetic range image of Figure 5. A four-nearest-neighbor MRF processor was used to restore the image. Note that the edges have been preserved.
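The anomaly measure used in this comparison is simple to compute. The sketch below follows the definition quoted in the text (a pixel is anomalous if it differs from the original by more than two range counts); the function name `percent_anomalies` and its argument names are hypothetical.

```python
import numpy as np

def percent_anomalies(original, test, threshold=2):
    """Percentage of pixels in `test` that differ from `original` by more than `threshold` counts."""
    original = np.asarray(original, dtype=float)
    test = np.asarray(test, dtype=float)
    return 100.0 * np.mean(np.abs(test - original) > threshold)
```

Applying the function to the corrupted image and again to each restored image, and averaging over several noise realizations, would give the kind of curves that Figure 7 reports.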

FIGURE 7. Average percent anomalies for range imagery as a function of sensor CNR. The results from MRF restoration are compared with median filtering and the case in which no processing has been performed to restore the image. Each data point represents an average over 10 runs at a given noise level. For effective target detection, the percent of anomalies must be less than about 10%. Thus the results indicate that MRF restoration is effective up to a CNR of about 6 dB. At this CNR value the average percent of anomalies is 71% for the input image, 55% for the median-filtered image, and only 4.5% for the MRF-restored image. Clearly, MRF restoration provides superior detection performance in a high-noise environment. (Plot axes: percent anomalies, 0 to 100, versus carrier-to-noise ratio (dB) for sensor, 0 to 40.)

We have also evaluated the utility of the MRF image-restoration model as a preprocessor for target recognition in a noisy environment. Eight different binary silhouettes (broadside views of different vehicles, Figure 8) were used in this experiment. Each of the silhouettes was centered in a 128 × 128 pixel image to simulate range images, which were then corrupted with the range measurement model to produce images corresponding to realistic range measurements. Twelve such images were produced for each of the 8 silhouettes for a total of 96 images per each CNR value from 6 dB through 20 dB in 1 dB steps, and at 30 dB. Next, detection and segmentation were performed on these simulated range measurements to obtain noisy binary range slices in which only those pixels within a certain range are shown. The range slices were then used to train a Nearest Neighbor Classifier (NNC) [15] to separate the 8 different vehicles. We found that at very high CNR values (30 dB) the classifier created a unique category for each silhouette. At lower CNR values (higher noise), however, the classifier formed extra categories because exemplars of the same target were sometimes classified into different categories. We repeated the above procedure twice: once using the median-filter technique on the corrupted images before the detection and segmentation steps, and once using MRF restoration.

Figure  9  compares  MRF  restoration  with  iterated 
median  filtering  and  with  the  case  in  which  no 
processing  had  been  performed  to  restore  the  image. 

The  performance  at  each  CNR  value  is  defined  as 
the  fraction  of  the  96  examples  that  the  NNC 
has  classified  correctly.  Note  that  the  MRF  restora¬ 
tion  is  able  to  maintain  an  acceptable  level  of  perfor¬ 
mance  at  a  CNR  that  is  5  and  10  dB  lower  than  with 
the  median-filter  and  no-preprocessing  case,  respec¬ 
tively.  Thus,  with  MRF  restoration,  a  sensor  can  be 
operated  at  roughly  25%  the  power  level  required  by 


the  use  of  a  median  filter. 

Hardware  Implementation 

For real-time image restoration, the MRF model can be implemented either digitally, on custom digital signal processing (DSP) chips or on a single-instruction multiple-data (SIMD) computer such as the Connection Machine manufactured by Thinking Machines Corp., or in an analog manner on a custom VLSI chip.

FIGURE 8. Binary silhouettes of 8 different vehicles that were used to evaluate the effect of
MRF restoration on target recognition. (For the test results, see Figure 9.)

FIGURE 9. Fraction of correctly classified range slices as a function of sensor
CNR. In the experiment, the binary silhouettes of 8 different vehicles (Figure 8)
were used to create realistic noise-corrupted range measurements. Detection
and segmentation were performed on these simulated range measurements to
obtain noisy binary range slices in which only those pixels with a certain range are
shown. A total of 12 such slices was produced for each of the 8 silhouettes at each
CNR value. The 96 range slices at each CNR value were then used for training a
Nearest Neighbor Classifier (NNC) [15] to separate the 8 different vehicles. Before
being presented to the NNC, some of the range slices were restored by the MRF
model and others by an iterated median filter. The results compare MRF restoration
with the median-filter technique and with the case in which no processing
has been performed.

FIGURE 10. System architecture implemented as a resistive
grid (compare with Figure 1). The voltages Vm
and Vr represent the measured and restored images,
respectively. Note the presence of both field and surround
resistors.

For digital implementation, 72 floating-point
operations are required per pixel update. Typically, a
pixel must be updated about 100 times over the
course of the restoration. Thus a 256 × 256 image
restoration would require 472 million floating-point
operations, and a frame-rate restoration of the same
image would require digital hardware that delivers 14
GFLOPS of performance. This performance level is
at the leading edge of current digital processing technology.
The advantage of an all-digital implementation
is that it does not require the hardwiring of any
of the system parameters.
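The operation counts quoted for the digital implementation can be reproduced with a few lines of arithmetic. The sketch below assumes a frame rate of 30 frames per second, which the text does not state but which is consistent with the 14-GFLOPS figure; the variable names are illustrative.

```python
# Figures taken from the text.
flops_per_update = 72        # floating-point operations per pixel update
updates_per_pixel = 100      # typical number of updates per pixel
pixels = 256 * 256           # image size

# Assumption: "frame rate" is taken to mean 30 frames per second.
frame_rate_hz = 30

flops_per_image = flops_per_update * updates_per_pixel * pixels
gflops_needed = flops_per_image * frame_rate_hz / 1e9

print(f"{flops_per_image / 1e6:.0f} million operations per image")
print(f"{gflops_needed:.1f} GFLOPS at {frame_rate_hz} frames/s")
```

The product comes to about 472 million operations per image and about 14 GFLOPS at the assumed frame rate, matching the values in the paragraph.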

An  all-analog  implementation  requires  the  imple¬ 
mentation  of  the  energy  function  as  an  analog  circuit. 
In  the  current  example  the  system  architecture  can  be 
represented  as  a  resistive  grid  (Figure  10).  The  resis¬ 
tors  are  nonlinear  in  that  their  resistances  are  voltage 
dependent: 


R(Δp) = …

where Δp refers to the voltage difference between a
pair of adjacent sites. The nonlinear resistor is essentially
the inverse of Equation 9, and the circuit consists
of separate field and surround resistors, as shown
in Figure 10. At steady state there is a current balance
at every site and the voltages Vr correspond to the
intensities in the restored image. The advantage of
this implementation is that there is essentially a “processor”
at every site, and the processing speed is limited
only by the settling time of the analog circuit.
The throughput is limited by the input/output onto
and off the analog chip, and not the circuit. With
current technology, images can be restored at a rate in
the thousands of frames per second.

Summary

An efficient Markov Random Field (MRF) based
method for performing piecewise smooth image restorations
has been demonstrated. The underlying
model uses a neural network sigmoid potential between
pixel pairs to allow the formation of sharp
boundaries between dissimilar regions in the presence
of noise. A novel deterministic method, called
Gradient Descent Gain Annealing (GDGA), for solving
the nonlinear coupled set of differential equations
that the MRF model introduces was presented. The
GDGA algorithm typically requires fewer than
200 iterations to restore an image, where the number
of iterations is roughly proportional to the level of
noise in the image. Computer simulations on noisy
images have shown that restorations can be performed
for very high noise levels (i.e., images that have up to
71% of their pixels corrupted with non-Gaussian
sensor noise). Simulation results indicate that MRF
restoration provides a 5-dB advantage in the carrier-to-noise
ratio (CNR) over conventional iterated median
filtering. Although the same model is currently
used to restore images from different sensors, arbitrary
potentials can be incorporated for the pixel interactions
so that the system can be tailored to specific
natural scenes and sensors. The system uses a massively
parallel set of local neighborhoods (four nearest
neighboring pixels) for efficient implementation on
a parallel-processing computer or a custom analog
VLSI chip.


Acknowledgments 

The  author  wishes  to  thank  members  of  the  Opto- 
Radar  Systems  Group  at  Lincoln  Laboratory  for  very 
helpful  technical  discussions  and  computer  support 
throughout  the  course  of  this  research.  In  particular, 
the  author  is  indebted  to  William  M.  Wells  III 
for  many  helpful  discussions  and  suggestions,  and 
for  help  in  reviewing  the  existing  image-restoration 
literature. 

This work was sponsored by the Defense Advanced
Research Projects Agency.


REFERENCES 


1. G.L. Bilbro and W.E. Snyder, “Range Image Restoration
Using Mean Field Annealing,” in Advances in Neural Information
Processing Systems I, D.S. Touretzky, ed. (Morgan Kaufmann,
San Mateo, CA, 1989), pp. 594-601.

2. D. Geiger and F. Girosi, “Parallel and Deterministic Algorithms
for MRFs: Surface Reconstruction and Integration,”
IEEE Trans. Pattern Anal. Mach. Intell. 13, 401 (1991).

3. S. Geman and D. Geman, “Stochastic Relaxation, Gibbs Distributions,
and the Bayesian Restoration of Images,” IEEE
Trans. Pattern Anal. Mach. Intell. 6, 721 (1984).

4. D.E. Rumelhart, G.E. Hinton, and R.J. Williams, “Learning
Internal Representations by Error Propagation,” in Parallel
Distributed Processing: Explorations in the Microstructure of
Cognition, Vol. 2, D.E. Rumelhart and J.L. McClelland,
eds. (MIT Press, Cambridge, MA, 1986), p. 422.

5. E. Aarts and J. Korst, Simulated Annealing and Boltzmann
Machines (John Wiley, NY, 1989), p. 130.

6. S. Kirkpatrick, C.D. Gelatt, Jr., and M.P. Vecchi, “Optimization
by Simulated Annealing,” Science 220, 671 (1983).

7. N. Metropolis, A. Rosenbluth, M. Rosenbluth, A. Teller, and
E. Teller, “Equation of State Calculations by Fast Computing
Machines,” J. Chem. Phys. 21, 1087 (1953).

8. J.L. Marroquin, “Probabilistic Solution of Inverse Problems,”
Ph.D. Thesis, Dept. of Electrical Engineering and Computer
Science, MIT, Cambridge, MA, 1985.

9. J. Leinbach, “Automatic Local Annealing,” in Advances in
Neural Information Processing Systems I, D.S. Touretzky, ed.
(Morgan Kaufmann, San Mateo, CA, 1989), pp. 602-609.

10. A. Blake and A. Zisserman, Visual Reconstruction (MIT Press,
Cambridge, MA, 1987), p. 131.

11. Y.G. LeClerc, “Constructing Simple Stable Descriptions for
Image Partitioning,” Intl. J. Comput. Vision 3, 73 (May 1990).

12. S.D. Conte and C. de Boor, Elementary Numerical Analysis:
An Algorithmic Approach, 3rd ed. (McGraw-Hill, NY, 1980),
p. 226.

13. M.B. Mark, “Multipixel, Multidimensional Laser Radar System
Performance,” Ph.D. Thesis, Dept. of Electrical Engineering
and Computer Science, MIT, Cambridge, MA, 1986.

14. A.B. Gschwendtner, R.C. Harvey, and R.J. Hull, “Coherent
IR Laser Technology,” Optical and Laser Remote Sensing, D.K.
Killinger and A. Mooradian, eds. (Springer-Verlag, NY, 1983),
p. 327.

15. R.O. Duda and P.E. Hart, Pattern Classification and Scene
Analysis (John Wiley, NY, 1973).


VOLUME 6, NUMBER 1, 1993 THE LINCOLN LABORATORY JOURNAL




MURALI M. MENON
is currently a research staff
member in the Opto-Radar
Systems Group. He received a
B.S., an M.S., and a Ph.D.
degree in chemical engineering
from Case Western Reserve
University in Cleveland. His
research interests include applied
pattern recognition, signal
processing, and image processing,
with special interests in
wavelets and artificial neural
networks. He has spent the past
six years at Lincoln Laboratory
working on applications of
neural networks for processing
sensor data, including the
design of automatic target
recognition (ATR) systems.




Machine  Intelligent 
Automatic  Recognition  of 
Critical  Mobile  Targets  in 
Laser  Radar  Imagery 

Richard L. Delanoy, Jacques G. Verly, and Dan E. Dudgeon

■  A  variety  of  machine  intelligence  (MI)  techniques  have  been  developed  at 
Lincoln  Laboratory  to  increase  the  performance  reliability  of  automatic  target 
recognition  (ATR)  systems.  Useful  for  recognizing  targets  that  are  only 
marginally  visible  (due  to  sensor  limitations  or  to  the  intentional  concealment 
of  the  targets),  these  MI  techniques  have  become  integral  parts  of  the 
Experimental  Target  Recognition  System  (XTRS) — a  general-purpose  system 
for  model-based  ATR.  Using  laser  radar  images  collected  by  an  airborne  sensor, 
the  prototype  system  recognized  a  variety  of  semi-trailer  trucks  with  high 
reliability,  even  though  the  trucks  were  deployed  in  high-clutter  environments. 


The construction of an automatic target recognition
(ATR) system is a demanding task.
ATR systems must be able to locate and identify
specific targets that can be concealed intentionally
through obscuration or camouflage, that are often
designed to be nearly invisible in radar imagery,
and that can be deployed in the midst of distracting
signals. To gain tactical advantage, it is generally important
that an ATR system be able to find a target
from as far away as possible. Under such conditions,
the selectively indicative signal features (signatures)
associated with a target are often barely discernible
from the background. Thus, inevitably, practical ATR
systems must be able to discriminate targets from
background in spite of weak, ambiguous, uncertain,
variable, or even contradictory evidence.

ATR system development can be particularly difficult
under certain mission constraints and when the
costs of system error are high. One such ATR application
is the use of airborne sensors to recognize strategic
relocatable targets (SRT) such as the SS-25 ICBM
of the former Soviet Union (Figure 1 [left]). An ATR
system for recognizing SRTs must search through
images generated by one or more sensors (laser radars,
real- or synthetic-aperture radars, passive infrared imagers,
and video cameras), requiring techniques of
data fusion. The search is for a very small number of
targets in a continent-sized area. The targets might be
caught in the open, but more likely will be found
along tree lines, perhaps partially occluded by foliage.
Because of the nature of the targets, a high probability
of detection is crucial. And yet the ATR system must
generate few false alarms (FA) due to mission limits
on the number of weapons that an aircraft can carry
to destroy the SRTs, the flying time of the aircraft
over the target area, and the processing capabilities of
human operators who must decide which detections
to pursue. A closely related mission is the detection of
Scud launchers, such as those used by Iraq in the
Persian Gulf War. A general term, critical mobile
target (CMT), refers to all mobile missile launchers,
including those used with SS-25 and Scud missiles.


FIGURE  1.  Photographs  of  (left)  mobile  missile  launcher  carrying  a  strategic  SS-25  ICBM  of  the  former  Soviet  Union,  as 
shown  in  a  Soviet  newspaper,  and  (right)  tank  truck  used  by  Lincoln  Laboratory  as  a  substitute  vehicle  to  develop  and  test 
an automatic target recognition (ATR) system for detecting critical mobile targets (CMT) such as the SS-25 launcher.


To achieve reliable detection and recognition performance
in such demanding applications, we have
developed several new machine intelligence (MI) techniques,
including new approaches for model-based
classification [1-3], automatic learning of models [4],
knowledge-based signal processing [5, 6], selective
attention [7], and pixel-level data fusion [7]. The
Experimental Target Recognition System (XTRS) developed
at Lincoln Laboratory [8-10] provides a framework
for the application of these techniques and for
the rapid prototyping of ATR systems. Though XTRS
and these new MI techniques were intended specifically
for ATR, they constitute a general-purpose approach
to object recognition, with many potential
applications. For example, XTRS has been applied
successfully to the detection and tracking of hazardous
weather phenomena, as described in the article
“Machine Intelligent Gust Front Detection,” by Richard
L. Delanoy and Seth W. Troxel in this issue [11].
In the current article, we apply XTRS to the detection
and recognition of CMTs, specifically, tank trucks
(Figure 1 [right]) and logging trucks used as substitutes
for missile launchers.

Low-Level  Machine  Intelligence 

Computer vision systems have traditionally been designed
in terms of a hierarchy of levels. Low-level
vision works on a domain of pixel-level data. Because
the associated image processing operations are highly
repetitive and therefore relatively slow, low-level operations
tend to be kept simple. And, because a scene
can contain many objects of potential interest, low-level
operations tend to be generic and relatively devoid
of object-dependent knowledge. A typical low-level
operation is edge detection. In high-level vision,
the pixel-level data (for the edge-detection example, a
pair of images showing the strength and orientation
of edges) are transformed into symbolically described
features. Object identification is then performed by
matching these features against prior knowledge of
object characteristics.

This basic organization of computer vision has
often been used in the design of ATR systems. In the
detection process, which is analogous to low-level
vision, a threshold is applied to a set of signals. The
signals can come directly from a sensor or they can be
the result of a signal processing operation. For the
threshold to be effective, the signals associated with
targets must form a distribution that is distinguishable
from the distribution of signals associated with
other objects in the background (i.e., clutter). Figure
2(a) illustrates this point. In any realistic detection
problem, there will exist some targets that have signals
below the detection threshold; those targets will
not be detected. There will also exist instances of
clutter associated with signals that are above the detection
threshold; such instances will result in FAs.
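The trade-off between missed targets and false alarms described above can be illustrated numerically. The sketch below is not taken from the article: it assumes, purely for illustration, that target and clutter signals follow Gaussian distributions with equal spread, and the means, spread, and function name are hypothetical.

```python
import math


def tail_probability(threshold, mean, sigma):
    """P(signal > threshold) for a Gaussian-distributed signal."""
    return 0.5 * math.erfc((threshold - mean) / (sigma * math.sqrt(2.0)))


# Hypothetical signal statistics (arbitrary units), chosen only for illustration.
clutter_mean, target_mean, sigma = 0.0, 2.0, 1.0

for threshold in (0.5, 1.0, 1.5):
    p_detect = tail_probability(threshold, target_mean, sigma)        # targets above threshold
    p_false_alarm = tail_probability(threshold, clutter_mean, sigma)  # clutter above threshold
    print(f"threshold={threshold:.1f}  Pd={p_detect:.3f}  Pfa={p_false_alarm:.3f}")
```

Lowering the threshold raises both the detection probability and the false-alarm probability; moving the two means farther apart, as in Figure 2(b), improves both at once.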

Developers of ATR systems have generally followed
the strategy of keeping low-level vision devoid of
object-dependent knowledge, and the processing done
in preparation for the application of thresholds is
usually kept simple. As a result, the thresholds must
be set fairly low to maximize the likelihood of detecting
targets. The high-level recognition process is then
responsible for suppressing as many FAs as possible.
Usually, FA suppression is accomplished through the
use of classifiers based on statistical techniques or MI
techniques, including those involving expert systems,
model-based matching, and neural networks. However,
when a given ATR application involves only one
or at most a few intended targets or classes of targets,
the use of object-dependent knowledge for the detection
process is feasible. And, in fact, because targets
are often hidden, camouflaged, or otherwise only marginally
visible, object-dependent knowledge can play
an important role in enhancing detectability. In particular,
detection performance can be improved by
separating the distributions of targets and clutter, as
shown in Figure 2(b). Although various techniques
have been developed to increase this separation (see,
for example, Reference 12), the techniques do not
typically involve any detailed object-dependent
knowledge.

Thus, in addition to using MI techniques in the
conventional role of high-level classification and FA
suppression, we have developed a set of new techniques
for low-level MI. Thresholds are still an unavoidable
part of detection but, when low-level MI is
applied directly to the pixel-level data early in the
detection process, the use of thresholds is in a relative
sense postponed and, as a result, made more effective.

Interest  Images 

A key to implementing low-level MI in XTRS is the
concept of interest and interest image [7]. By our definition,
interest is a dimensionless quantity indicating
the likelihood that a specific feature, indicative of a
target or class of targets, is present at a given image
pixel. A spatial map of such interest values, each
constrained to the range [0,1], constitutes an interest
image. Clusters of high interest values are used as a
guide to focus computational resources on likely targets.
In Figure 2(a), a threshold was applied to the
quantity “signal strength.” For simple ATR systems,
the signal is typically the intensity of returned electromagnetic
energy or a simple function thereof. Interest
provides an alternative flexible metric to which thresholds
can also be applied. The power of this approach
is that the output of any sensor modality or feature
detector can arguably be expressed as an interest image.
Furthermore, the use of interest as a common
denominator greatly simplifies the fusion of pixel-level
data. Specifically, interest enables the use of simple
arithmetic or fuzzy logic to fuse spatial evidence from
a variety of sources.

There are three steps in low-level MI as used by
XTRS in the detection process. First, relevant feature
detectors are selected, given knowledge of the situational
context. The context may include the intended
set of targets, various sensor-related parameters, and
identifiable environmental conditions affecting sensor
performance and target appearance. Often the use
of just one feature detector can accomplish adequate
target-detection performance.

FIGURE 2. Discrimination of target and clutter signals
through the application of a threshold: (a) typical overlap
of targets and clutter distributions, and (b) illustration of
how signal processing, either conventional or machine
intelligence (MI) based, can improve detection performance
by increasing the separation of the distributions.

In the second step, the selected feature detectors
are applied to the appropriately prepared input imagery.
Each detector generates as its output an interest
image that provides spatial evidence for the presence
of particular target features. A targeted object may be
represented by more than one detector; each detector
looks, for example, for a distinct set of features or for
an alternative target configuration.

The final step calls for the fusion of evidence,
which is accomplished with a rule of combination
prescribing how interest values from multiple interest
images are to be combined. The rule of combination
depends on the set of feature detectors selected. In the
case of multiple feature detectors looking for alternative
target configurations, the rule of combination
could be the maximum (fuzzy-or in fuzzy set theory);
i.e., at a specific pixel location, the maximum of the
interest values across all interest images at that location
could be used. In the case of several feature
detectors looking for different vehicle features that are
likely to be present all the time, the fusion of interest
values might be done by an averaging process. Although
not fully exercised in the CMT version of
XTRS, the rule of combination could be arbitrarily
complex to reflect knowledge of the variable reliability
of different feature detectors under different viewing
conditions.
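The two rules of combination mentioned above, pixelwise maximum and pixelwise average, can be sketched in a few lines. This is only an illustration under the assumption that each interest image is a NumPy array of the same shape with values in [0,1]; the function names and sample values are hypothetical and are not taken from XTRS.

```python
import numpy as np


def fuse_max(interest_images):
    """Fuzzy-or: pixelwise maximum across interest images (alternative configurations)."""
    return np.max(np.stack(interest_images), axis=0)


def fuse_mean(interest_images):
    """Pixelwise average across interest images (features expected to co-occur)."""
    return np.mean(np.stack(interest_images), axis=0)


# Two small hypothetical interest images from different feature detectors.
a = np.array([[0.1, 0.9], [0.4, 0.0]])
b = np.array([[0.3, 0.5], [0.8, 0.0]])

print(fuse_max([a, b]))   # strongest evidence at each pixel
print(fuse_mean([a, b]))  # averaged evidence at each pixel
```

Because both rules keep their outputs in [0,1], the fused result is itself an interest image to which a threshold can be applied.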

For situations in which only targets return strong
intensity  signals,  the  intensity  signal  returns  might 
provide  a  ready-made  interest  image.  In  practice,  how¬ 
ever,  laser  intensity  can  be  an  unreliable  discriminant 
because  the  energy  returned  from  a  target  surface 
depends  on  the  specularity  of  the  surface  and  its 
orientation  relative  to  the  incident  laser  beam.  Also, 
high  (or  low)  range  values  from  a  laser  radar  are 
usually  unreliable  predictors  of  target  locations  be¬ 
cause  targets  are  not  customarily  parked  at  the  highest 
(or  lowest)  points  in  a  locality.  Thus  the  effective  use 
of  laser  radar  imagery  requires  that  objects  in  the 
imagery  be  identified  also  on  the  basis  of  shape. 

Functional  Template  Correlation 

In studying the principal techniques for shape analysis,
we found that the basic equations of cross-correlation
and mathematical morphology (MM) [13] can
be generalized into a single class of operations, which
we have called functional shape matching. Briefly described,
these shape-analysis tools all use kernels (structuring
elements in MM), which are basically subimages
that are looked for within the image to be probed. For
probing a pixel in an image, the origin of a particular
kernel is positioned over the pixel's location. A two-argument
function is then applied to each kernel
value and the corresponding value in the image. (For
cross-correlation, the two-argument function is multiplication;
for MM dilation and erosion, the functions
are addition and subtraction, respectively.) Next,
an arbitrary operator is applied to the function values
obtained for the set of pixel locations on the kernel.
(For cross-correlation, the operator is summation; for
MM dilation and erosion, the operators are maximum
and minimum, respectively.)
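The generalization described above, a two-argument function followed by a reducing operator, can be written as one small routine. The following is a minimal sketch and not the authors' implementation: it assumes the kernel origin is its top-left element, evaluates only positions where the kernel lies fully inside the image, and applies the kernel without reflection, as in the description above. The function and variable names are illustrative.

```python
import numpy as np


def functional_shape_match(image, kernel, pair_fn, reduce_fn):
    """Apply pair_fn to (image window, kernel), then reduce_fn over the kernel support."""
    kh, kw = kernel.shape
    out_h = image.shape[0] - kh + 1
    out_w = image.shape[1] - kw + 1
    out = np.empty((out_h, out_w), dtype=float)
    for r in range(out_h):
        for c in range(out_w):
            window = image[r:r + kh, c:c + kw]
            out[r, c] = reduce_fn(pair_fn(window, kernel))
    return out


image = np.arange(9, dtype=float).reshape(3, 3)   # values 0..8
ones = np.ones((2, 2))
flat = np.zeros((2, 2))                           # flat structuring element

xcorr = functional_shape_match(image, ones, np.multiply, np.sum)      # cross-correlation
dilation = functional_shape_match(image, flat, np.add, np.max)        # MM dilation
erosion = functional_shape_match(image, flat, np.subtract, np.min)    # MM erosion
print(xcorr, dilation, erosion, sep="\n")
```

Swapping in a different pair of functions yields a different member of the same family without changing the scanning loop, which is the sense in which the classic tools are special cases.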

Eventually, we realized that functional shape matching
not only includes the classic shape-analysis tools,
but it also encompasses a variety of signal processing
techniques that have never been tried before. From
functional shape matching, we implemented a tool
for generalized matched filtering called functional template
correlation (FTC) [5]. Whereas the kernel of the
classic techniques is a subimage indicating specific
expectations of image values for a successful match,
the kernel used in FTC is a set of indexes, each
corresponding to a unique scoring function. Each of
these scoring functions can define arbitrary expectations
for image values at each pixel location on the
kernel. The outputs of these scoring functions are
scalar values, which are averaged and then “clipped”
to the range [0,1]. (In the clipping process, those
averaged scores which are less than zero are assigned a
value of zero, while those averaged scores which are
greater than one are assigned a value of one.) Comparable
in spirit to the membership functions of fuzzy
set theory, scoring functions provide a means of encoding
uncertainties. But, in addition, scoring functions
can be used to encode a surprising amount of
knowledge of the physics of a matching problem.
Using FTC, we can construct customized matching
techniques that are more powerful than the classic
shape-analysis operations. (Note: For a brief introduction
to FTC, see the box, “Functional Template
Correlation,” in the article “Machine Intelligent
Gust Front Detection,” by Delanoy and Troxel in this
issue [11].)
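The FTC procedure described above (a kernel of indexes, one scoring function per index, averaging, then clipping to [0,1]) can be sketched as follows. This is an illustrative reading of the description, not the authors' code: the two scoring functions, the 1 × 3 kernel, and the sample height values are hypothetical, and the kernel origin is assumed to be its top-left element.

```python
import numpy as np


def ftc(image, index_kernel, scoring_fns):
    """Average the per-pixel scores under the kernel and clip the result to [0, 1]."""
    kh, kw = index_kernel.shape
    out = np.zeros((image.shape[0] - kh + 1, image.shape[1] - kw + 1))
    for r in range(out.shape[0]):
        for c in range(out.shape[1]):
            window = image[r:r + kh, c:c + kw]
            scores = [scoring_fns[index_kernel[i, j]](window[i, j])
                      for i in range(kh) for j in range(kw)]
            out[r, c] = np.clip(np.mean(scores), 0.0, 1.0)
    return out


# Hypothetical scoring functions on height values (meters).
scoring_fns = [
    lambda h: 1.0 if h < 0.5 else -1.0,   # index 0: expect ground level
    lambda h: 1.0 - abs(h - 3.0),         # index 1: expect a height near 3 m
]
index_kernel = np.array([[0, 1, 0]])      # ground, object, ground

heights = np.array([[0.0, 3.0, 0.0, 0.0],
                    [0.0, 0.0, 0.0, 0.0]])
interest = ftc(heights, index_kernel, scoring_fns)
print(interest)
```

Negative scores let a scoring function penalize evidence that contradicts the template, and the final clipping keeps the output usable as an interest image.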

Data  Used  in  System  Development  and  Testing 

For the development and testing of a CMT version of
XTRS, a large dataset of images in Maine was collected
to simulate the detection of CMTs in a high-clutter,
mainly forested environment. Semi-trailer
trucks, which approximate the appearance of missile
launchers, were positioned amidst both natural and
man-made clutter. The simulated targets included the
tank truck shown in Figure 1 (right), the same tank
truck but under camouflage netting, a loaded logging
truck, and an empty logging truck. This variety of
vehicles was used to test XTRS’s ability to discriminate
between targets of similar shapes and sizes. The
vehicles were positioned on or near roads, both in the
open and along tree lines. The man-made, or cultural,
clutter included residential neighborhoods and a logging
camp (Figure 3) that contained heavy logging
equipment such as other semi-trailer trucks.

Pixel-registered range and intensity images of the
various vehicles were generated with the Hughes-Danbury
GaAs Laser Linescanner carried aboard a
Gulfstream G-1 aircraft (Figure 4). Characterized by
a 0.85-µm wavelength, the linescanner has a range
ambiguity of 10 m, a precision of ambiguous-range
values of 15 cm, and an angular resolution of 1.0
mrad. Images were collected during the winter and
summer of 1989 with a down-looking sensor platform
that was operated at altitudes between 200 and
300 m by the Opto-Radar Systems Group of Lincoln
Laboratory. The example range and intensity images
shown in Figure 5 reveal the high resolution achieved
with the 0.85-µm linescanner. Image widths were
between 64 and 150 m; image lengths could be arbitrarily
long because of the linescanner used. Long
scans were subdivided into overlapping images ranging
in length from 100 to 400 m. In total, the dataset
collected contained 2303 image pairs (range and intensity)
covering 17.13 km² of ground area under
both winter (snow) and summer (dense foliage)
conditions.

FIGURE 3. Aerial photographs of man-made, or cultural, clutter represented in the laser-radar-image dataset. Contained in
the photographs are railroad cars, fuel tanks, stacks of logs, empty logging trucks, other road vehicles, and heavy logging
machinery. All of these objects, easily confused with the set of targets being sought, are potential sources of false alarms
(FA).

FIGURE 4. Lincoln Laboratory airborne sensor platform.
The Gulfstream G-1 carries a 0.85-µm down-looking laser
radar linescanner, a 10.6-µm forward-looking laser radar,
and an 8-to-12-µm forward-looking passive imager. The
images used in the experiments described in this article
were collected with the 0.85-µm linescanner.
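The sensor figures above give a feel for the ground sampling involved. As a rough sketch, and assuming a nadir-pointing beam and the small-angle approximation (neither of which is stated in the text), the ground footprint of one resolution cell is the altitude multiplied by the angular resolution; the function name is illustrative.

```python
def ground_footprint_m(altitude_m, angular_resolution_rad):
    """Approximate footprint of one resolution cell directly below the sensor."""
    return altitude_m * angular_resolution_rad


angular_resolution_rad = 1.0e-3   # 1.0 mrad, from the text

# Altitude range quoted in the text.
for altitude_m in (200.0, 300.0):
    footprint_cm = 100.0 * ground_footprint_m(altitude_m, angular_resolution_rad)
    print(f"altitude {altitude_m:.0f} m -> footprint about {footprint_cm:.0f} cm")
```

Under these assumptions the footprint is roughly 20 to 30 cm, which is comparable to the quoted 15-cm range precision and small relative to a semi-trailer truck.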

System  Description 

The architecture of the CMT version of XTRS consists
of five modules (Figure 6): preprocessing, detection,
extraction, decomposition, and matching. Each
module has a standard structure (Figure 7) that consists
of four main elements: (1) a parameter library:
a collection of algorithms, numbers, and/or data structures
that encode knowledge relevant to the current
stage of processing; (2) a parameter selector: a rule-based
expert, i.e., a collection of rules, that uses contextual
information and previous results to choose
parameters from the parameter library; (3) a generic
processing engine; and (4) a rule-based feedback expert
that evaluates the output of the processing engine
and decides where control should be directed. In
the complete system, the feedback expert of one module
and the parameter selector of the subsequent module
conceptually form a local-control node.
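The four-element module structure described above can be pictured as a small data structure. The sketch below is purely illustrative and does not reflect the actual XTRS code: the class, field names, thresholds, and the season-based context are all hypothetical, and the "rules" are reduced to simple Python callables.

```python
from dataclasses import dataclass
from typing import Any, Callable, Dict, Tuple


@dataclass
class Module:
    """One processing stage: parameter library, selector, engine, and feedback expert."""
    name: str
    parameter_library: Dict[str, Any]
    select_parameters: Callable[[Dict[str, Any], Dict[str, Any]], Any]  # (library, context) -> params
    engine: Callable[[Any, Any], Any]                                   # (data, params) -> output
    feedback: Callable[[Any], str]                                      # output -> next module name

    def run(self, data: Any, context: Dict[str, Any]) -> Tuple[Any, str]:
        params = self.select_parameters(self.parameter_library, context)
        output = self.engine(data, params)
        return output, self.feedback(output)


# Hypothetical detection stage: the threshold on interest values depends on the season.
detection = Module(
    name="detection",
    parameter_library={"winter": 0.8, "summer": 0.6},
    select_parameters=lambda library, context: library[context["season"]],
    engine=lambda interest, threshold: [v for v in interest if v >= threshold],
    feedback=lambda hits: "extraction" if hits else "preprocessing",
)

hits, next_module = detection.run([0.2, 0.9, 0.7], {"season": "winter"})
print(hits, next_module)
```

In this reading, the feedback callable of one stage and the selector of the next together play the role of the local-control node mentioned in the text.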

Preprocessing 

The Hughes-Danbury GaAs Laser Linescanner produces pixel-registered
range and intensity images that, in preparation for the detection
process, require a number of data transformations.

First, because an aircraft's speed with respect to the ground varies
depending on the wind velocity vector, the aspect ratios of targets
that have been imaged by the linescanner can often become distorted.
In an operational system, these distortions can be avoided by using an
inertial navigation system either to regulate the linescan rate or to
provide data that would allow the images to be corrected by
interpolation. For the data used in this article, the interpolation
was performed interactively to obtain the correct target aspect
ratios.

Second, the ambiguous-range values are converted to absolute altitudes
above some arbitrary reference altitude. Once the absolute altitudes
have been determined, a map of altitudes for the local ground level
can be computed with a technique based on morphological operations
[14-16]. In the technique, only those surface shapes which are wider
than the intended targets are retained as part of the local ground
level. (Note: There exist other techniques for estimating the local
ground level; see, for example, Reference 17.) When the altitudes of
the local ground level are subtracted from the absolute altitudes, the
resulting image will contain values that are the heights of small
objects (including targets) above the local ground level. In this
article, subsequent uses of the term range image will refer to this
image of heights above the local ground level.
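As a rough illustration of the ground-level estimation described
above, the following Python sketch applies a one-dimensional grayscale
morphological opening (erosion followed by dilation) to a single
altitude profile. The function names, the flat structuring element,
and the sample profile are assumptions made for this example; the
technique actually used is described in References 14 through 16 and
operates on two-dimensional images.

```python
def erode(profile, half_width):
    """Grayscale erosion: minimum over a sliding window (clipped at edges)."""
    n = len(profile)
    return [min(profile[max(0, i - half_width):min(n, i + half_width + 1)])
            for i in range(n)]


def dilate(profile, half_width):
    """Grayscale dilation: maximum over a sliding window (clipped at edges)."""
    n = len(profile)
    return [max(profile[max(0, i - half_width):min(n, i + half_width + 1)])
            for i in range(n)]


def local_ground_level(altitudes, half_width):
    """Opening keeps only shapes at least as wide as the window."""
    return dilate(erode(altitudes, half_width), half_width)


def heights_above_ground(altitudes, half_width):
    ground = local_ground_level(altitudes, half_width)
    return [a - g for a, g in zip(altitudes, ground)]


# Toy profile: flat ground at 100 m, a 3-sample-wide object 3 m tall,
# and a 9-sample-wide rise of 2 m that should be kept as ground.
altitudes = ([100.0] * 3 + [103.0] * 3 + [100.0] * 3
             + [102.0] * 9 + [100.0] * 2)

# Window of 5 samples: wider than the object, narrower than the rise.
heights = heights_above_ground(altitudes, half_width=2)
```

With this window, the narrow object is removed from the ground
estimate and survives in the height profile, while the wide rise is
absorbed into the local ground level.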

For the last step of preparation, the range and intensity images are
scaled by linear interpolation to a resolution of 0.25 m per pixel
side. Images of lower resolution (0.5 and 1.0 m per pixel side) are
then generated by a subsampling of the data. The lower-resolution
images are used for detection, while the high-resolution images are
used for extraction and high-level matching.

(Note: During preprocessing, no attempt was made to clean the
linescanner imagery of noise. Such a procedure was unnecessary due, in
part, to both the high quality of the imagery and the noise-resistant
properties of FTC.)

Detection 

In the CMT version of XTRS, three-dimensional detection is essentially
performed by four target detectors (i.e., feature detectors
representing whole targets instead of individual features). The tank
truck is represented by two alternative target detectors, one in which
the truck is exposed, the other in which the truck is covered with
camouflage netting. The logging truck is similarly represented by two
target detectors, one in which the truck is empty, the other in which
the truck is loaded with logs.


FIGURE 5. Example (a) intensity and (b) range images taken during
winter with the 0.85-µm down-looking laser linescanner shown in
Figure 4. Note the tank truck (lower left), empty logging truck (upper
left), and house trailer (upper right) in both images. The range image
has been transformed such that each pixel value represents a height
above an arbitrary reference altitude with lighter pixels indicating a
greater height.




[Figure 6 diagram labels: Sensor imagery; Recognized targets]

FIGURE 6. Architecture of Experimental Target Recognition System
(XTRS), showing the five processing modules as well as the way in
which knowledge is represented at each level.

The target detector for the exposed tank truck consists of the two
functional templates shown in Figure 8. The first functional template
encodes the expected appearance of the truck in range imagery (i.e.,
images of heights above the local ground level). In scoring function
1, which corresponds to the top surfaces of the cab and trailer, a
maximal score of 1.0 is returned for heights from 2.5 to 3.5 m. The
uncertainty comes from signal noise and inaccuracies in estimating the
local ground level. The negative scores reflect the fact that tank
trucks are opaque to laser illumination; i.e., the presence of
ground-level heights where the cab or trailer is expected constitutes
strong evidence that a target is not present at that location. On the
other hand, heights greater than the expected interval of 2.5 to 3.5 m
result in scores no less than 0.5, the level of ambiguity, because
such heights could potentially indicate the presence of an occluding
surface. In other words, the cab of a tank truck might or might not be
present under an occluding surface that is at least 4.0 m high.

The other scoring functions work in the same manner, except that the
expected interval of heights for the background in scoring function 0
is from 0.0 to 0.5 m, and the expected interval for the hitch area in
scoring function 2 is from around 1.0 to 2.0 m. These scoring
functions are tuned such that, when the template is applied to a patch
of bare ground (zero height), the negative scores from scoring
function 1 balance the positive scores generated by scoring functions
0 and 2, resulting in an overall score near 0.0. And, of course, an
unobscured target should generate a score near 1.0.

The above functional-template design provides a simple means of
minimizing the effects of occlusion. This capability, not easily
obtained with cross-correlation or MM operations, is necessary for
finding targets partially covered by foliage. In the extreme, a target
that is completely occluded should generate a score of 0.5 (Figure 9).
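The behavior of the range template described above can be sketched in
Python. The sketch is a simplified illustration, not the authors'
implementation: the one-dimensional index kernel, the piecewise-linear
breakpoints, and the helper names are assumptions chosen so that bare
ground scores 0.0, an unobscured target scores 1.0, and a fully
occluded target scores 0.5, as the text requires.

```python
def piecewise_linear(points):
    """Return a scoring function that interpolates between (x, y) points."""
    def score(x):
        if x <= points[0][0]:
            return points[0][1]
        for (x0, y0), (x1, y1) in zip(points, points[1:]):
            if x <= x1:
                return y0 + (y1 - y0) * (x - x0) / (x1 - x0)
        return points[-1][1]  # beyond the last breakpoint
    return score


# Assumed breakpoints (height in m, score). Every function settles at
# 0.5 (ambiguity) for tall heights that could be an occluding surface.
SCORING_FUNCTIONS = {
    0: piecewise_linear([(0.0, 1.0), (0.5, 1.0), (1.0, 0.5)]),        # background
    1: piecewise_linear([(0.0, -0.5), (1.5, -0.5), (2.5, 1.0),
                         (3.5, 1.0), (4.0, 0.5)]),                    # cab, trailer
    2: piecewise_linear([(0.0, 0.5), (1.0, 1.0), (2.0, 1.0),
                         (2.5, 0.5)]),                                # hitch area
}

# Index kernel: which scoring function applies at each template pixel.
INDEX_KERNEL = [0, 1, 1, 2, 1, 1, 1, 0]


def template_score(heights):
    """Average of the indexed scoring functions over the template."""
    scores = [SCORING_FUNCTIONS[k](h) for k, h in zip(INDEX_KERNEL, heights)]
    return sum(scores) / len(scores)


bare_ground = [0.0] * 8
exposed_truck = [0.0, 3.0, 3.0, 1.5, 3.0, 3.0, 3.0, 0.0]
fully_occluded = [6.0] * 8                                   # canopy everywhere
half_occluded = [0.0, 3.0, 3.0, 1.5, 6.0, 6.0, 6.0, 6.0]
```

Because occluded pixels contribute the ambiguous value 0.5 rather
than a negative score, a half-covered truck in this toy setup still
scores well above bare ground.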

In Figure 8, the second functional template for the exposed tank truck
is designed for intensity imagery. Because the surface of the truck is
smooth (specular) with regard to the laser wavelength, the reflected
laser beam will tend either to miss the sensor (resulting in a low
intensity value) or hit the sensor directly (resulting in a high
value). Scoring function 4, corresponding to the trailer and cab,
encodes these expectations by returning high scores for very low and
very high intensity values, and low scores for intermediate intensity
values. The hitch area returns the laser energy more diffusely; thus,
scoring function 5, which corresponds to the hitch area, returns high
scores for intermediate intensity values. The intensity values
associated with surrounding ground areas are highly variable and
unpredictable, except that they are seldom very low. Scoring function
3 encodes this expectation with highly negative scores for low
intensity values, but nil (i.e., no opinion) for intermediate and high
intensity values.

The two functional templates shown in Figure 8 were applied
simultaneously to the input range and intensity images, and overall
scores were computed as the average of the scores returned from the
six scoring functions. Because the orientations of targets are
typically arbitrary and unknown a priori, an FTC score was computed
for each of 36 uniformly spaced template rotations (10° increments) at
each pixel location of the input imagery. For a particular pixel
location in an input image, the score associated with the maximally
scoring orientation was assigned to the corresponding pixel location
in the output image. As previously indicated, each such output image
is treated as an interest image, indicative of the likelihood of
finding a target at any particular pixel.

In a similar way, output interest images were also generated for each
of the three other target detectors, and the four interest images were
combined by taking the maximal scores at each pixel location. The
resulting combined interest image was then scanned for pixels having
interest scores above a certain threshold (typically 0.78), and the
above-threshold pixels were grouped into clusters.
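The combination and clustering steps above can be sketched in Python
as follows. The pixelwise maximum and the 0.78 threshold come from the
text; the grouping rule is not specified there, so the 8-connected
flood fill used here, along with the function names and the toy
images, is an assumption.

```python
from collections import deque


def combine_interest(images):
    """Pixelwise maximum over the interest images of all detectors."""
    rows, cols = len(images[0]), len(images[0][0])
    return [[max(img[r][c] for img in images) for c in range(cols)]
            for r in range(rows)]


def cluster_above_threshold(interest, threshold=0.78):
    """Group above-threshold pixels into 8-connected clusters (assumed rule)."""
    rows, cols = len(interest), len(interest[0])
    seen, clusters = set(), []
    for r in range(rows):
        for c in range(cols):
            if interest[r][c] <= threshold or (r, c) in seen:
                continue
            queue, cluster = deque([(r, c)]), []
            seen.add((r, c))
            while queue:
                y, x = queue.popleft()
                cluster.append((y, x))
                for dy in (-1, 0, 1):
                    for dx in (-1, 0, 1):
                        ny, nx = y + dy, x + dx
                        if (0 <= ny < rows and 0 <= nx < cols
                                and (ny, nx) not in seen
                                and interest[ny][nx] > threshold):
                            seen.add((ny, nx))
                            queue.append((ny, nx))
            clusters.append(cluster)
    return clusters


# Two toy interest images standing in for two of the four detectors.
detector_a = [[0.1, 0.8, 0.2, 0.1],
              [0.1, 0.2, 0.2, 0.1],
              [0.1, 0.1, 0.1, 0.3]]
detector_b = [[0.2, 0.5, 0.9, 0.1],
              [0.1, 0.2, 0.2, 0.1],
              [0.1, 0.1, 0.1, 0.85]]

combined = combine_interest([detector_a, detector_b])
clusters = cluster_above_threshold(combined)
```

Each resulting cluster would then be boxed to extract the candidate
subimages.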

Next, a box was placed around each cluster. The boxes were used to
extract range and intensity subimages containing the interest cluster
and thus the candidate target. The boxes were square, with sides 80%
longer than the longest dimension of the targets being sought. Up to
four boxes with above-threshold interest scores and a minimum of
overlap with each other were constructed for each image. The cluster
of above-threshold interest scores that led to the creation of a
particular box was used to create a list of target hypotheses. At each
pixel in a cluster, the interest score is always associated with the
highest-scoring target detector at the highest-scoring orientation.
Each pixel's hypothesis consisted of the highest-scoring target
detector's name, pixel coordinates, orientation, and interest score,
which was used to rank the hypotheses. For each box generated,
information regarding the size and location of the box as well as
hypotheses about what might be in the box was placed in a data
structure referred to as a “window.” Because the CMT version of XTRS
uses the function maximum for the rule of combination of interest
scores, the score achieved at any pixel location by the
highest-scoring target detector is also the value stored at that same
location in the window.

[Figure 7 diagram labels: Prior module; Module; Subsequent module;
Data; Contextual information; Parameter selector; Parameter library;
Processing engine; Rule-based feedback expert; Local-control node;
From global feedback expert; To global feedback expert]

FIGURE 7. Standard structure of the XTRS processing modules shown in
Figure 6. Note that the feedback expert of one module and the
parameter selector of the subsequent module conceptually form a
local-control node.

[Figure 8 diagram labels: Intensity index kernel; Hitch area; Scoring
functions; Height (m); Intensity; panels (a) and (b)]

FIGURE 8. Functional template for the top view of the tank truck.
Shown are the index kernels and indexed scoring functions for (a)
range and (b) intensity images. Note that scoring function 3, i.e.,
the scoring function for the surrounding ground in intensity imagery,
has no opinion above a certain intensity value.


Extraction  and  Decomposition 

Windows generated by the detection process are used as input to
extraction. The position and size of each window are used to extract
full-resolution (0.25 m per pixel side) subimages of range, intensity,
and interest. In the extraction module (Figures 6 and 7), the
parameter selector chooses from the library an extractor corresponding
to the highest-ranking hypothesis in a window. In the current
implementation of our system, a full-resolution functional template is
created for the extractor by a zooming process that is applied to the
corresponding detection template. The preliminary location and
orientation information recorded in the hypothesis is then used to
probe the window with the full-resolution template. The window is
probed only at the pixels immediately surrounding the hypothesis
location and only for orientations within 10° of the hypothesis
orientation. Although the angular increment for FTC is 10° for
detection, an increment of 1° is used for extraction.
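The restricted probing described above amounts to a small local search
around the detection hypothesis. The following Python sketch shows one
way to express it. The function name, the one-pixel search radius, and
the toy score function are assumptions; score_fn is a hypothetical
stand-in for a full-resolution FTC evaluation.

```python
def refine_hypothesis(row, col, angle_deg, score_fn,
                      pixel_radius=1, angle_radius=10, angle_step=1):
    """Search near a hypothesis; return (score, row, col, angle) of the best probe."""
    best = None
    for dr in range(-pixel_radius, pixel_radius + 1):
        for dc in range(-pixel_radius, pixel_radius + 1):
            for da in range(-angle_radius, angle_radius + 1, angle_step):
                angle = (angle_deg + da) % 360
                score = score_fn(row + dr, col + dc, angle)
                if best is None or score > best[0]:
                    best = (score, row + dr, col + dc, angle)
    return best


def toy_score(row, col, angle):
    """Toy score peaked at row 12, column 7, orientation 33 degrees."""
    return -(abs(row - 12) + abs(col - 7) + abs(angle - 33))


# Detection reported pixel (11, 7) at 30 degrees (a multiple of 10).
best_score, best_row, best_col, best_angle = refine_hypothesis(11, 7, 30, toy_score)
```

With these settings the search evaluates 3 x 3 pixel positions and 21
orientations, i.e., 189 probes, rather than the full window at all 360
one-degree orientations.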

At the location and orientation of the best full-resolution FTC match,
a rectangular mask with outside dimensions that approximate the
dimensions of the candidate target is positioned to isolate an image
region. The isolated region then undergoes the application of height
thresholds followed by a cleaning with MM, and the resulting region is
subdivided with a stencil consisting of an array of rectangles (six or
eight in our application), each marking the area limits of one
subregion. For the tank truck, Figure 10(a) shows the eight idealized
subregions, each of which is characterized with regard to a number of
attributes such as length and width, and various texture measures such
as a measure of the local variance in the subregion. The characterized
object region and part subregions together with a list of candidate
target identities extracted from the window hypotheses serve as input
to the matching process. (Note: If the matching module fails to make
an identification, control is directed back to the beginning of the
extraction module. In such a case, the extraction process is repeated
for the hypothesis that has the next highest interest score. The
processing stops either when the target has been identified or when
all hypotheses have been examined.)

Matching 

Candidate targets are identified by matching the object region and
part subregions against appearance models (AMs) [1-3]. Figure 10(b)
illustrates the general construction of an AM for the tank truck. Note
that the AM consists of an object node (TANK TRUCK) and a series of
part nodes (e.g., PART 1) that specify the limits of properties of the
different parts. Attributes of the object region and part subregions
can include length, width, aspect ratio, circumference, average image
value, and texture measurements, among other quantities. Each object
and part node contains a set of fuzzy predicates and associated
weights that define the allowable limits for the computed values of
the different attributes. There is typically one fuzzy predicate for
each attribute. In a similar way, constraints (e.g., COMBINED WIDTH
and SAME HEIGHT) specify the limits of the relations between parts.

[Figure 10 diagram labels: (a) subregions numbered 1 through 8, with
example attribute values SUBREGION 7: LENGTH 5.1, WIDTH 1.4, HEIGHT
2.1, TEXTURE 7.6 and SUBREGION 5: LENGTH 5.2, WIDTH 1.4, HEIGHT 1.9,
TEXTURE 8.2; (b) object node TANK TRUCK; part nodes PART 1, PART 2,
...; constraints COMBINED WIDTH and SAME HEIGHT; each node carries
fuzzy predicates (e.g., LENGTH, WIDTH) with weights W_i]

FIGURE 10. Appearance models: (a) top view of an idealized tank-truck
region decomposed with a target stencil into eight characterized
subregions, and (b) corresponding appearance model (AM) for the tank
truck. Note that each of the eight subregions is characterized with
regard to a number of attributes such as length and width, and various
texture measures. In the AM, an object node (TANK TRUCK) is broken
into eight part nodes (PART 1 through PART 8) corresponding to the
eight subregions. Each object and part node contains a set of fuzzy
predicates that define the allowable limits for computed values of the
different attributes such as length and width. Each predicate has an
associated weight W_i that is used to bias computed match scores. In a
similar way, constraints (e.g., COMBINED WIDTH and SAME HEIGHT)
specify the limits of the relationships between the different parts.
The AM shown here has been simplified. In practice, the AMs of the
modeled trucks have as many as 80 constraints between the different
parts. Note, also, that for the sake of simplicity, the existence of
constraints between certain parts (e.g., between PART 2 and PART 4)
has not been shown.

By treating a computed attribute θ as the argument of the
corresponding fuzzy predicate f(x), we can easily obtain a score f(θ)
for the computed value θ. The scores obtained from a set of fuzzy
predicates together with the weights associated with those predicates
can then be used to calculate a weighted average that provides an
overall match score for each part. Similarly, a match score can be
computed for each constraint. For example, the sum of the widths of
PART 1 and PART 2 would be the input to the constraint COMBINED WIDTH
shown in Figure 10(b). Match scores for each part and each constraint
become pieces of evidence that can then be combined with the
Dempster-Shafer theory of evidence. The output is a target identity,
which may be none to indicate an unknown target type. (Note:
References 1 through 3 provide a detailed description of matching
based on AMs, including a description of the Dempster-Shafer theory of
evidence.)
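The weighted-average part score described above can be illustrated
with a brief Python sketch. The attribute values are the SUBREGION 7
numbers that appear in Figure 10(a); the trapezoidal predicate shapes,
their limits, and the weights are hypothetical values chosen for the
example. The Dempster-Shafer combination step is not shown.

```python
def trapezoid(zero_lo, one_lo, one_hi, zero_hi):
    """Fuzzy predicate: 1.0 on [one_lo, one_hi], ramping to 0.0 outside."""
    def predicate(x):
        if one_lo <= x <= one_hi:
            return 1.0
        if x <= zero_lo or x >= zero_hi:
            return 0.0
        if x < one_lo:
            return (x - zero_lo) / (one_lo - zero_lo)
        return (zero_hi - x) / (zero_hi - one_hi)
    return predicate


def part_match_score(attributes, predicates, weights):
    """Weighted average of the fuzzy-predicate scores for one part."""
    total_weight = sum(weights[name] for name in predicates)
    weighted_sum = sum(weights[name] * predicates[name](attributes[name])
                       for name in predicates)
    return weighted_sum / total_weight


# Computed attributes of one subregion (values from Figure 10(a)).
attributes = {"LENGTH": 5.1, "WIDTH": 1.4, "HEIGHT": 2.1}

# Hypothetical predicates and weights for the corresponding part node.
predicates = {
    "LENGTH": trapezoid(4.0, 4.8, 5.4, 6.2),
    "WIDTH": trapezoid(1.0, 1.2, 1.6, 1.8),
    "HEIGHT": trapezoid(1.7, 2.5, 3.5, 4.3),
}
weights = {"LENGTH": 0.1, "WIDTH": 0.1, "HEIGHT": 0.2}

score = part_match_score(attributes, predicates, weights)
```

Here the length and width fall inside their allowable intervals, while
the height lies halfway up a ramp, so the more heavily weighted height
predicate pulls the part score down to about 0.75.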

Using the above approach, we constructed five AMs, one each for the
exposed tank truck, the camouflaged tank truck, the loaded logging
truck, the empty logging truck, and the truck cabs. The cabs were
modeled through a separate AM because, in several cases, the frame
boundary of the images had occluded the trailers.

Through experience, we learned that the AMs that were more successful
were generally more complicated. As the size and complexity of the AMs
grew, however, it became apparent that we could not continue to
construct AMs manually. Thus automatic construction techniques were
needed.

Automatic model building requires example sets of the decomposed
targets. For each attribute of each part of each target, fuzzy
predicates can be constructed from the population of values found in
the example set. Figure 11 shows a fuzzy predicate that has been
constructed for the attribute LENGTH of the part node PART 1. The red
dots at the top of the figure represent the population of length
values from all PART 1s in the example set. During the construction of
a fuzzy predicate, outlier (i.e., statistically inconsistent) values
are discarded, and a cluster analysis is performed to determine the
number of clusters that might best explain variances in the remaining
values. For each cluster, the mean and standard deviation are
computed, and an interval of maximum returned score (1.0) is
established between the minimum and maximum lengths of each cluster.
Outside this interval of maximum returned score, the fuzzy-predicate
curve ramps down from 1.0 to 0.0 with a slope that is proportional to
the standard deviation σ of the cluster population. The value of σ is
multiplied by the coefficient β, called the recognition tolerance, to
determine the width of the ramping interval. For small values of β,
the fuzzy predicate is relatively intolerant of lengths that are
outside the already observed range of values, while high values of β
result in a greater tolerance of such variations. The final fuzzy
predicate is the maximum of the individual functions generated for
each cluster. The weight associated with each fuzzy predicate is
initialized to 0.1, a value chosen to allow an increase (and decrease)
by at least an order of magnitude.
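The construction just described can be sketched in Python. The sketch
assumes that outlier rejection and cluster analysis have already been
done, so it takes the clusters as input; it also assumes linear ramps
of width β·σ on both sides of each cluster and uses the population
standard deviation. The function names and the toy length values are
hypothetical.

```python
from statistics import pstdev


def cluster_function(values, beta):
    """Score 1.0 between the cluster's min and max, linear ramps outside."""
    lo, hi = min(values), max(values)
    ramp_width = beta * pstdev(values)  # recognition tolerance times sigma

    def score(x):
        if lo <= x <= hi:
            return 1.0
        distance = lo - x if x < lo else x - hi
        if ramp_width == 0.0 or distance >= ramp_width:
            return 0.0
        return 1.0 - distance / ramp_width
    return score


def build_fuzzy_predicate(clusters, beta):
    """Final predicate: maximum of the per-cluster functions."""
    functions = [cluster_function(values, beta) for values in clusters]
    return lambda x: max(f(x) for f in functions)


# Toy length values (m), already split into two clusters.
clusters = [[4.0, 6.0], [10.0, 12.0]]

# Each toy cluster has sigma = 1.0, so beta = 2.0 gives 2.0-m-wide ramps.
length_predicate = build_fuzzy_predicate(clusters, beta=2.0)
initial_weight = 0.1  # starting weight stated in the text
```

Raising β in this sketch widens the ramps, so values farther outside
the observed range still receive a nonzero score; lowering it makes
the predicate stricter.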

Fuzzy predicates are constructed for each attribute of each part. Not
all attributes, however, are equally effective in discriminating
targets from clutter. To determine which attributes are effective
discriminants, we use a second phase of model building called
supervised discrimination learning. In the process, weights associated
with attributes that are weakly discriminating are decreased, while
weights for attributes that are strongly discriminating are increased.
Whether an attribute is discriminating or not is determined by
individually reevaluating each attribute within the AMs of targets
after an incorrect identification has been made. If a fuzzy predicate
returned a high score that contributed to the error, then the
associated attribute is nondiscriminating and the corresponding weight
is decreased. For example, consider the response to an FA in which
some piece of clutter has been incorrectly identified as a target. In
the identification process, fuzzy predicates were evaluated for the
different attributes. If the score from a particular evaluation was
greater than 0.5 (ambiguity), then that attribute contributed to the
mistaken identity and is thus not discriminatory; consequently, the
corresponding weight is decreased. On the other hand, if the score was
less than 0.5, indicating that the attribute had correctly denied the
mistaken identity but was outvoted by the other fuzzy predicates, then
the associated weight is increased. (Note: Reference 4 contains
specific equations and schedules for the weight adjustments, along
with a more detailed description of supervised discrimination
learning.)
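The false-alarm case described above can be sketched in Python. The
direction of each adjustment follows the text; the multiplicative
update and its factor are purely illustrative assumptions, since the
actual equations and schedules are given in Reference 4. The function
name and the example scores are likewise hypothetical.

```python
AMBIGUITY = 0.5


def update_weights_after_false_alarm(weights, scores, factor=2.0):
    """Adjust predicate weights after clutter was mistaken for a target.

    Scores above 0.5 supported the mistaken identity, so their weights
    shrink; scores below 0.5 argued against it, so their weights grow.
    The multiplicative factor is an assumed stand-in for Reference 4.
    """
    updated = dict(weights)
    for name, score in scores.items():
        if score > AMBIGUITY:
            updated[name] = weights[name] / factor
        elif score < AMBIGUITY:
            updated[name] = weights[name] * factor
    return updated


# All weights start at the initial value of 0.1.
weights = {"LENGTH": 0.1, "WIDTH": 0.1, "HEIGHT": 0.1, "TEXTURE": 0.1}

# Hypothetical fuzzy-predicate scores from one false alarm.
scores = {"LENGTH": 0.9, "WIDTH": 0.8, "HEIGHT": 0.5, "TEXTURE": 0.2}

new_weights = update_weights_after_false_alarm(weights, scores)
```

In this toy case the texture attribute, which correctly argued against
the false alarm, gains influence, while length and width lose it and
the exactly ambiguous height score leaves its weight unchanged.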

Results 

Much of the innovation of the CMT version of XTRS is in the
development of techniques for low-level MI. To evaluate the
effectiveness of these techniques, this section will present the
detection results first, separate from the results of the overall
system recognition performance.

Detection  Performance 

Each of the four FTC-based target detectors was tested individually
for a range of interest thresholds. Figure 12 shows the probability of
detection P_D plotted as a function of the false detection (FD) rate
for the four implemented detectors. (Note: FD is distinct from FA,
which is the false-alarm level for the overall system.)

[Figure 11 diagram labels: Example values; Cluster 1; Cluster 2;
Outlier; horizontal axis: Length (m)]

FIGURE 11. Automatic construction of a fuzzy predicate for the
attribute LENGTH of the part node PART 1 of Figure 10(b). The red dots
at the top of the figure represent the population of length values
from all PART 1s in the example set. During the construction of a
fuzzy predicate, outlier (i.e., statistically inconsistent) values are
discarded, and a cluster analysis is performed to determine the number
of clusters that might best explain variances in the remaining values.
For each cluster, an interval of maximum returned score (1.0) is then
established between the minimum and maximum lengths of that cluster.
Outside this interval, the fuzzy-predicate curve for the cluster ramps
down to 0.0 with a slope that is proportional to the standard
deviation σ of the cluster population multiplied by the recognition
tolerance β. The final fuzzy predicate is the maximum of the
individual functions generated for each cluster.

[Figure 12 axis: False detections/km², logarithmic scale from 10^-1 to
10^2]

FIGURE 12. Probability of detection P_D plotted as a function of the
false-detection rate for the four target detectors. Each point along
any of the curves shown is associated with a particular value of the
interest (or detection) threshold.

For the tank-truck detectors, both exposed and camouflaged, the
detection performance was quite good. In both cases, P_D was around
0.7 at the threshold level where the first FD occurred. Given the
17.13 km² of ground area covered by the dataset, the one FD resulted
in a rate of 0.058 FD/km². For a P_D of 1.0, the associated minimum FD
rate was approximately 2.0 FD/km². The target detector for the loaded
logging truck performed slightly less well. Because the shape of the
vehicle changed with each load of logs, the detector's functional
template had to be constructed with more fuzziness; i.e., the template
had to provide a greater tolerance for variations in shape. But the
most difficult to represent as a functional template was the empty
logging truck, because of the small size of the vehicle's trailer.
With an elongate shape such as that of the tank truck, small
uncorrected distortions in the length of the target did not have a
serious effect. With the empty logging truck, however, the rear axles
of the vehicle are the only reliably visible part of the trailer and,
because this portion of the truck is short relative to the overall
truck length, even a small distortion in the truck length can move the
axles ahead or behind the patch of the functional template
representing the axles. Consequently, the shape and appearance of the
empty logging truck could not be defined as precisely as for the other
targets. Fortunately, scoring functions can be modified easily to
adjust the degree of tolerance to variations in shape and appearance.
Although the detection rates for the two logging-truck configurations
were lower for a given FD rate than for the tank truck, the
performance was still respectable.
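Each point on the curves of Figure 12 pairs a probability of detection
with a false-detection rate at one interest threshold. The following
Python sketch shows how such an operating point could be computed and
checks the quoted rate of one FD over 17.13 km². The function name and
the score lists are illustrative assumptions and are not data from the
experiments.

```python
def operating_point(target_scores, clutter_scores, threshold, area_km2):
    """Return (P_D, FD rate per km^2) at one interest threshold."""
    detected = sum(1 for s in target_scores if s > threshold)
    false_detections = sum(1 for s in clutter_scores if s > threshold)
    return detected / len(target_scores), false_detections / area_km2


DATASET_AREA_KM2 = 17.13  # ground area covered by the dataset

# Illustrative peak interest scores (not measured values).
target_scores = [0.72, 0.75, 0.79, 0.81, 0.83, 0.85, 0.88, 0.90, 0.92, 0.94]
clutter_scores = [0.40, 0.55, 0.65, 0.70, 0.82]

# Threshold at which exactly one clutter object is falsely detected.
p_d, fd_rate = operating_point(target_scores, clutter_scores,
                               threshold=0.80, area_km2=DATASET_AREA_KM2)

# Sweeping the threshold traces out a curve like those in Figure 12.
curve = [operating_point(target_scores, clutter_scores, t, DATASET_AREA_KM2)
         for t in (0.90, 0.80, 0.70)]
```

With one false detection over the whole dataset, the rate is 1/17.13,
or about 0.058 FD/km², which matches the figure quoted in the text.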

For a better understanding of the sources of FDs, the clutter data
were divided into natural and man-made (cultural) clutter. Any image
containing a large, man-made object (e.g., a building, non-target
vehicles, or stacks of logs) was placed in the cultural-clutter group.
Each of the target detectors was then applied to both divisions of the
data. Figure 13 shows example results for the loaded logging truck.
The probability density was computed as the percent of all detections
found within each successive short interval of interest scores (0.016
in the range from 0.0 to 1.0).

Most instances of natural clutter (mainly trees and shrubs) tended to
have interest scores around 0.65, with no interest scores above 0.8.
The population of target interest scores (shown as red dots at the top
of Figure 13) had scores ranging from 0.70 to 0.94. Thus the detector
for the loaded logging truck achieved a perfect partition between
targets and natural clutter (i.e., a threshold of 0.78 resulted in
100% detection with no FDs). In contrast, man-made objects were a more
troublesome source of FDs because such objects generated a few
interest scores that were as high as 0.9. Included in the high-scoring
cultural objects were other semi-trailer trucks, such as the deployed
tank truck, and stacks of logs similar in shape to the loads carried
by the logging trucks. The results for the other target detectors were
roughly the same as that for the loaded logging truck, with the
detectors for the tank truck providing a slightly better partition
between targets and clutter, and the detector for the empty logging
truck a slightly worse partition.

[Figure 13 plot label: Targets]

FIGURE 13. Distributions of interest (or detection) scores for natural
and man-made (cultural) clutter for the loaded-logging-truck target
detector. The red dots at the top of the figure indicate the interest
scores for instances of the deployed loaded logging trucks. At an
interest (or detection) threshold of about 0.78, the
loaded-logging-truck target detector achieves perfect discrimination
between deployed loaded logging trucks and natural clutter. There is
no threshold, however, that would yield a perfect discrimination
between these targets and cultural clutter.

It can be argued that logging trucks provide more stressful testing
than would arise from an actual CMT application. Because missile
launchers are large and nonarticulated, their detection is less
vulnerable to the effects of distortion. Also, CMTs have only three
basic shape variations: missile down for transport, missile erected
for launch, and without a missile following launch. These three
variations have precisely known shapes, in contrast to the amorphous
nature of log loads.

Occlusion Experiments

One of the primary motivations for the development of FTC was to overcome the way in which occlusion disrupted the more traditional shape-matching techniques. Various attempts to design an MM approach
for  detection  and  extraction  failed  with  targets  that 
had  as  little  as  S(,<>  of  their  surface  areas  occluded 
by foliage. With functional templates, however, all targets could be detected readily without much challenge.

To explore the limitations of FTC in detecting targets occluded by foliage, we designed experiments in which targets that were cut out from one image were positioned along a tree line within another image. Beginning at locations where the target was completely unobscured, the target was incrementally moved under the foliage, with the vehicle's major (i.e., longitudinal) axis either perpendicular or parallel to the tree line. Figure 14 summarizes the results of the experiment with the target perpendicular to the tree line for target occlusions of 2%, 36%, and 66%.

The left frames in each row are range images of a tank truck that has been synthetically placed perpendicular to the tree line. The center frames show the location and orientation of the best match for the tank-truck functional template in the images, along with the corresponding interest scores. The pixels themselves indicate the scores returned from individual scoring functions for each location on the template: black, white, and gray pixels represent 0.0, 1.0, and intermediate values, respectively. The right frames show the results of recognition based on the matching of the decomposed target with AMs of the targets. It should be noted that occluded targets were included in the example set used to build the AMs.

The first row of Figure 14 shows that a target that is almost completely exposed (only 2% occlusion) results in a strong interest score and correct recognition. For a target occlusion of 36%, the interest score is barely above the interest threshold of 0.75, but the target is recognized correctly nonetheless. For an occlusion of 66%, the interest score is only 0.65, which is below the interest threshold of 0.75. Consequently, this target is not detected and therefore not processed by the matching module. (The UNKNOWN identification in the figure is the result of our dropping the interest threshold to below 0.63.) And yet, as shown in the center frame, the best match produced by FTC correctly determines the location and orientation of the target, despite the target's being more than half occluded. We have obtained similar results for the case in which the major axis of the tank truck is parallel to the tree line, as shown in Figure 15.

FIGURE 14. Summary of foliage occlusion experiment for a tank truck that has been synthetically positioned perpendicular to a tree line. The top, middle, and bottom rows are for the target with 2%, 36%, and 66%, respectively, of its surface area occluded by foliage. The left frames are range (height above ground) images, the center frames indicate the locations and orientations corresponding to the highest interest scores (indicated in the frames), and the right frames show the final AM-based recognition results.

FIGURE 15. Summary of foliage occlusion experiment similar to that of Figure 14, except the tank truck has been positioned parallel to the tree line.

For the combined data of perpendicular and parallel target placements, Figure 16 contains a plot of interest score as a function of percent occlusion. The figure shows that the decrease in interest score as a function of percent occlusion conforms to an expected linear relationship: performance degrades gradually as occlusion increases, without any intervals of rapid degradation. At an interest threshold of 0.75, targets occluded up to around 36% are detected and recognized. Lowering the threshold would permit the detection and recognition of targets with an even higher percent of occlusion, but would also increase the FD rate.


FIGURE 16. Summary of results for the experiments described in Figures 14 and 15 (interest score plotted against percent occlusion, 0 to 100). Note that at the detection interest threshold of 0.75, the system is able to detect and recognize targets with up to 36% of their surface areas occluded by foliage.


Overall  System  Performance 


Although FTC can provide high detection rates with few FDs, the fewer FDs the better. Because FTC bases its interest scores solely on how well image values compare to expectations at different locations on the kernel, it does not exploit the known relationships of target parts. We have relied on the technique of AMs as a means of modeling such additional information and, by so doing, have provided the means for rejecting FDs and discriminating between multiple, similarly shaped target classes.

For initial testing, we used an interest threshold of 0.75 to detect clusters of high interest values in the Maine dataset. A total of 402 detections resulted, including all 63 deployed targets. These detected targets were then extracted, characterized, and matched against AMs, as described earlier.

Initially, when we built the AMs we used 50% of the targets as examples and a recognition tolerance β of 0.3. The remaining targets that were classified as UNKNOWN (i.e., insufficiently like any modeled target) were subsequently added to the example set as we refined the models. Eventually, 80% of the targets were included in the example set to reach a recognition performance of 100%. No supervised learning of weights was done for this test. Under these conditions, there were no FAs in all 17.13 km² (2303 image pairs) of data.

The above results include some targets deployed in the open, but they also include a number of very difficult cases. Figure 17 shows photographs and a map of a deployment in which the empty logging truck and the camouflaged tank truck were placed on a narrow dirt road with tall trees on either side. For the empty logging truck, Figure 18 contains the range and intensity images, an interest image highlighting pixels having above-threshold interest values and showing the selected windows, and an image showing the final recognition results. The truck, visible in the lower left corner, has been correctly recognized. An FD, triggered by a collection of shrubs having roughly the size and spacing of the parts of the empty logging truck, was correctly rejected during AM-based matching. Figure 19 shows the results for the camouflaged tank truck of Figure 17. Note that in this case the




FIGURE  17.  Photographs  and  deployment  map  of  "hidden  targets"  used  in  the  Portage,  Maine,  experiments. 


target is represented in the range image mostly as a broad tent with only a few pixels having height values corresponding to the ground. Because the shape of the camouflage netting can vary from site to site, the scoring functions for the range functional template were constructed to incorporate considerable uncertainty and were thus weakly discriminating. In intensity images, however, the camouflage netting actually helps make the target stand out against the background, probably because of interference effects caused by some of the laser energy being reflected by the netting material and some being reflected by the ground [17]. The scoring function for the intensity functional template for the camouflaged tank truck exploits this phenomenon.

Generalization of AMs

In the course of evaluating the dataset, we discovered that, in addition to the one logging truck that Lincoln Laboratory personnel had deployed, there were six other empty logging trucks in the vicinity. Only two of these six trucks had interest scores greater than 0.75, and neither of the two was recognized as an empty logging truck. Figure 20 shows intensity and range images containing three of the six non-Lincoln Laboratory trucks, along with a road-mobile crane (top center). The truck in the image that had an interest score above threshold was classified as UNKNOWN. Initially, we were disappointed with this result until we realized that the discrimination made by XTRS was in fact reasonable and useful. Figure 21 (left) is a range image of the logging truck deployed by Lincoln Laboratory, and Figure 21 (right) is a range image of one of the six other trucks. Note that although both vehicles serve the same function and are called logging trucks, their appearances are in fact distinct. The truck deployed by Lincoln Laboratory has a tractor with the cab directly over the engine, while the other vehicle has a hooded tractor with the cab behind the engine. Also, although the overall lengths of the trailers are the same, the trailer of the Lincoln Laboratory truck is narrower and lighter in appearance. If the two vehicles are considered as two distinct objects, then XTRS did successfully discriminate between the two variants with 100% accuracy.

Suppose, however, that AMs we built using examples of one model within a target class were to be used to recognize a more general class of targets, including other models not represented in the example set. All six of the trucks not deployed by Lincoln Laboratory were detected by decreasing the interest threshold from 0.75 to 0.72. And all six vehicles were recognized as empty logging trucks by increasing the recognition tolerance β from 0.3 to 1.7. Thus, by using only two tunable parameters, we could adjust the generality of the recognition for the entire system. But the relaxation needed to generalize the recognition had an associated cost: the FA rate for the system as a whole increased from 0.0 to 1.7 FA/km². Of course, instead of generalizing the AMs to include similar related targets, we could have constructed additional functional templates and AMs.


Supervised Discrimination Learning of Model Weights

A common criticism of many research ATR systems is that, because of the limited availability of data, the


FIGURE 18. Detection and recognition results for the hidden empty logging truck of Figure 17. The truck is visible in the lower left corner of the images. A false detection, triggered by a collection of shrubs, has been correctly rejected as UNKNOWN.




FIGURE 19. Detection and recognition results for the hidden camouflaged tank truck of Figure 17.




FIGURE 20. Detection and recognition results for three empty logging trucks that were not part of the test deployment. (Note: The images also contain other vehicles, including pickup trucks and a road-mobile crane, at the top center. Also note that in the range image the building in the upper left corner has incorrect height values due to an artifact in estimating the height of local ground for large objects.) Two of the three trucks received below-threshold interest scores and were thus not detected. The remaining truck had an interest score above threshold but was classified as UNKNOWN. The system failed to detect and recognize the three trucks because of differences between them and the logging truck deployed by Lincoln Laboratory (see Figure 21).




FIGURE 21. Enlarged range images for (left) empty logging truck deployed by Lincoln Laboratory, and (right) empty logging truck discovered in the image dataset. Although both vehicles are called logging trucks, their appearances are in fact distinct. The cab for the truck on the left is directly over the engine, while the cab for the other vehicle is behind the engine. Also, although the overall lengths of the trailers for the two trucks are the same, the trailer for the truck on the left is narrower and lighter in appearance.




dataset used to train a system is the same set used for testing. To a certain extent, we addressed this criticism by dividing our available data into two sets, one for training, the other for testing. Because at most only 50% of the targets were to be used in the example training set, we assumed that a large recognition tolerance β would be required to achieve a PR of 1.0. With a large recognition tolerance, we also expected that the FA rate might be high. Consequently, the supervised learning of weights was used to suppress the FAs.

For training, we built the AMs with a high β value of 5.0 and an example set consisting of 1165 image pairs (range and intensity) containing 28 targets. AM weights were all initialized to 0.1. To establish a baseline FA rate, we did not use supervised discrimination learning to process the training data. The high β of 5.0 and a low interest threshold of 0.72 were selected so that enough FAs would be generated to promote opportunities for learning. The number of FAs under these conditions was 37 (4.3 FA/km²). Next, supervised discrimination learning was initiated and, with each complete pass through the training data, the number of FAs generated during that pass was recorded. Figure 22 shows the results of 14 passes through the training data. Note that the number of FAs dropped from 37 to 21 during the first pass and stabilized to an average of 19 FAs (2.2 FA/km²) by the fourth pass.
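
The false-alarm rates quoted for training follow from dividing the FA count by the ground area covered. A minimal check in Python, assuming the 8.6 km² training-set area listed in Table 1 (the helper name `fa_rate` is ours, not part of XTRS):

```python
def fa_rate(num_false_alarms: int, area_km2: float) -> float:
    """False alarms per square kilometer of imaged ground."""
    return num_false_alarms / area_km2


TRAINING_AREA_KM2 = 8.6  # ground area of the training set, taken from Table 1

baseline = fa_rate(37, TRAINING_AREA_KM2)  # before discrimination learning
learned = fa_rate(19, TRAINING_AREA_KM2)   # average after the fourth pass

print(f"{baseline:.1f} FA/km^2")                    # 4.3 FA/km^2
print(f"{learned:.1f} FA/km^2")                     # 2.2 FA/km^2
print(f"reduction: {1 - learned / baseline:.0%}")   # reduction: 49%
```

The drop from 37 to 19 false alarms corresponds to a reduction of roughly half in the FA rate over the same ground area.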

After the completion of training, testing was done on 27 targets in 1134 image pairs covering 8.4 km² of ground area. Using the weights learned from training,


FIGURE 22. Learning curve showing the decrease in false alarms with training (number of false alarms plotted against number of passes through training data).


we built the AMs at progressively larger values of β ranging from 0.1 to 7.0. For each β value, the probability of correct recognition PR was plotted as a function of the FA rate. Figure 23 shows the results. At the value of β at which the first FA occurs, the PR was 0.44. For a PR of 0.93, the FA rate was 1.92 FA/km². Two of the targets were not recognized (i.e., classification UNKNOWN) even with β set to 7.0. At some higher value of β, we do expect to achieve a PR of 1.0, but we did not attempt to find that particular β value. In this test, as well as in the previously described tests, no targets were mislabeled as another target identity.

Out of a total of 63 targets, only 55 were used: 28 for training and 27 for testing. The reason for this intentional omission of eight targets was that there were only four images of the tank truck in the open and four images of an empty logging truck with the trailer completely occluded by the frame boundary of the image. With only two examples of a target for training, the resulting AMs were too restrictive to recognize any targets other than the two training examples. This finding highlights how the building of robust AMs depends on the proper selection of a training set. As with any learning system, a realistic and representative sampling of variations of object appearance is necessary to achieve robust performance.

Conclusions 

The Experimental Target Recognition System (XTRS) provides a framework for applying machine intelligence (MI) techniques to the task of automatic target recognition (ATR). Based largely on aspects of fuzzy set theory, these MI techniques enable the representation of uncertainties and known variabilities in target appearance.

With rule-based experts and libraries of functions and data structures, XTRS can be organized to adapt automatically to environmental context and to reconfigure the search for alternative targets. Using multiple target detectors, XTRS can look simultaneously for different variations in target shape. The outputs of all target detectors are expressed as interest images, permitting the fusion of all sources of evidence into a single spatial map. Despite the apparent complexity of XTRS, system performance can be controlled effectively with only two tunable parameters: the interest threshold for controlling the output of low-level detection and the recognition tolerance for controlling the output of high-level matching based on appearance models (AMs).

FIGURE 23. Probability of correct recognition PR as a function of the false-alarm rate. Each point along the curve corresponds to a particular value of the recognition tolerance parameter β.

XTRS uses AMs to model how targets and their constituent parts appear in sensor imagery, thus providing an alternative to other classifiers, including those based on neural network, statistical, and other model-based approaches. Unlike other model-based approaches that encode the three-dimensional structure of an object, AMs define the observable appearance of targets in specific sensor data within the constraints of the likely target orientations. AMs provide a more controllable representation than neural networks. Because knowledge is represented in neural networks as a diffuse population of weights, it is difficult to identify which image features are being used. Not only are the attributes and weights of AMs easy to interpret, they can be modified by a user with predictable effects on recognition performance. Neural networks have gained in popularity as classifiers principally because of their ability to learn and encode discriminants automatically. As we have shown in this article, the automatic learning of class discriminants is also possible with AMs, but in a representation that is more amenable to understanding and selective editing.

Other techniques developed for XTRS embody what we call low-level MI. Most existing MI techniques used in computer vision rely on a preliminary abstraction of raw data into a symbolic form. But the process of abstraction necessarily reduces the amount of information available for decision making, thus handicapping an observer, no matter how intelligent. Tools for knowledge-based signal processing and pixel-level accumulation of evidence provide the intelligent means of using object- and context-dependent knowledge to guide the extraction of information directly from raw image data without the need for abstraction. In particular, functional template correlation (FTC) allows the construction of generalized matched filters that encode knowledge of the physics of a detection problem. Customized operations constructed with FTC are generally more powerful (i.e., more discriminating) than comparable traditional signal processing operations. In ATR versions of XTRS, we have used FTC as a one-step three-dimensional target recognizer. For other applications, we have developed knowledge-based fuzzy variations of standard image processing operations, including thin-line detection, smoothing operations, basic mathematical morphology (MM) operations, and pattern matching.

The need for FTC arose from a perceived inadequacy of the standard techniques of shape analysis. Although MM worked very well for unobscured targets, we could not devise a sequence of MM operations that would reliably detect and extract targets in high-clutter environments, especially when the target was partially occluded. We believe that our failure was due in part to the all-or-nothing nature of MM operations [18]. We have also investigated the use of normalized cross-correlation, the other commonly used tool for shape analysis. In its favor, cross-correlation does provide a variable degree of match that can be translated easily to interest values. But the matches generated by cross-correlation are too literal in that the interest scores are based on very specific, inflexible patterns of image values.

Table 1. Performance of Prototype System

                                          Experiment
                                    1        2        3 (Training)   3 (Testing)
Number of targets                   63       69*      28             27
Ground area (km²)                   17.13    17.13    8.6            8.4
Detection threshold                 0.75     0.72     0.72           0.75
Number of detections                492      1173     656            342
Number of targets in AM example set 50       50       28             —
Recognition tolerance               0.3      1.7      5.0            4.5
% correct recognition               100%     100%     100%           93%
FAs/km²                             0.0      1.7      2.2**          1.4

*  Includes six empty logging trucks not intentionally deployed
** After discrimination learning

The repetitive evaluation of all scoring functions in a functional template (for all orientations for each pixel location) sounds computationally prohibitive. But the process becomes feasible when the input image values are scaled to some integer range (e.g., 0 to 255) and the scoring functions are implemented as a precomputed two-dimensional lookup table that is indexed by the scoring-function numbers and the integer image values. The use of such a lookup table is generally faster than multiplication, making FTC evaluation quicker than cross-correlation.
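
The lookup-table idea can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the XTRS implementation: the two scoring functions, the 3x3 template, and the use of a simple average to combine per-element scores are all hypothetical, and the search over orientations is omitted. What the sketch does show is the mechanism described in the text: each scoring function is tabulated once over the integer range 0 to 255, and scoring a template placement then needs only table lookups indexed by scoring-function number and image value.

```python
def clamp01(x):
    """Limit a score to the interval [0, 1]."""
    return max(0.0, min(1.0, x))


def expect_high(v):
    """Hypothetical scoring function rewarding large image values."""
    return clamp01((v - 64) / 128)


def expect_low(v):
    """Hypothetical scoring function rewarding small image values."""
    return clamp01((192 - v) / 128)


def build_lut(scoring_functions, levels=256):
    """Precompute lut[function_number][image_value] for all integer values."""
    return [[f(v) for v in range(levels)] for f in scoring_functions]


def interest_at(image, template, lut, row, col):
    """Score one template placement using table lookups only.

    template[i][j] is the number of the scoring function assigned to that
    kernel element; averaging the looked-up scores is an assumption."""
    total = 0.0
    count = 0
    for i, template_row in enumerate(template):
        for j, func_number in enumerate(template_row):
            total += lut[func_number][image[row + i][col + j]]
            count += 1
    return total / count


lut = build_lut([expect_high, expect_low])  # 0 = expect_high, 1 = expect_low

# Template: a high-valued center column flanked by low-valued columns.
template = [[1, 0, 1],
            [1, 0, 1],
            [1, 0, 1]]

# 5x5 test image (values 0 to 255) with a bright vertical bar in column 2.
image = [[0, 0, 255, 0, 0] for _ in range(5)]

print(interest_at(image, template, lut, 1, 1))  # 1.0, template aligned with bar
print(interest_at(image, template, lut, 1, 0))  # about 0.33, template misaligned
```

Because the table is built once, the cost of arbitrarily complicated scoring functions is paid at construction time rather than at every pixel and orientation.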

Low-level MI also allows XTRS to delay the application of thresholds. Instead of applying thresholds either to a single image consisting of raw data or to the output of some simple transformation of the raw data, we can apply the thresholds to maps of interest containing evidence that has been extracted from a variety of sources.
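
To illustrate why delaying the threshold can matter, the sketch below compares thresholding each evidence source separately with thresholding a combined interest map. The pixelwise average used to combine the maps is an assumption made purely for illustration (this section does not state the combination rule), and all values and function names are hypothetical.

```python
def fuse(interest_maps):
    """Pixelwise mean of several equally sized interest maps (assumed rule)."""
    n = len(interest_maps)
    rows, cols = len(interest_maps[0]), len(interest_maps[0][0])
    return [[sum(m[r][c] for m in interest_maps) / n for c in range(cols)]
            for r in range(rows)]


def above(interest_map, threshold):
    """Set of (row, col) pixels whose interest meets the threshold."""
    return {(r, c)
            for r, row in enumerate(interest_map)
            for c, value in enumerate(row)
            if value >= threshold}


THRESHOLD = 0.75

# Made-up interest maps from two evidence sources (e.g., range and intensity).
range_interest = [[0.10, 0.70],
                  [0.80, 0.20]]
intensity_interest = [[0.20, 0.90],
                      [0.30, 0.10]]

# Early thresholding: each source decides alone, then decisions are intersected.
early = above(range_interest, THRESHOLD) & above(intensity_interest, THRESHOLD)

# Delayed thresholding: combine the evidence first, then decide once.
late = above(fuse([range_interest, intensity_interest]), THRESHOLD)

print(early)  # set()
print(late)   # {(0, 1)}
```

In this toy case, pixel (0, 1) falls just short in one source but is strongly supported by the other; it survives only when the decision is deferred until the evidence has been combined.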

Unlike the AMs, the FTC-based target detectors were constructed and tuned manually. The development of a useful, operational ATR system that is able to adapt swiftly to different targets and mission scenarios requires a mechanism for constructing functional templates automatically. We have developed methods for building functional templates from statistics accumulated from example targets, but these methods have not yet been implemented. Functional templates might also be constructed by using the emerging techniques of genetic programming.

The success of our approach to ATR is indicated by the overall system performance of the prototype system, as summarized in Table 1. In experiment 1, in which we used strict tolerances for the automatic construction of the target AMs, we were able to achieve 100% correct target recognition in the available data with no mislabelings and no false alarms. It is important to note that the AMs are flexible and can be generalized to broader classes of vehicles by the manipulation of a single recognition tolerance. Experiment 2 demonstrates this flexibility and, in particular, the capability for generalization by increasing the recognition tolerances. Six empty logging trucks were found in the dataset that were somewhat different from the one logging truck that was intentionally deployed. These six were appropriately rejected as clutter in experiment 1. Suppose, however, that the additional six trucks were to be included in a broader class of empty logging trucks. By changing just the recognition tolerance from 0.3 to 1.7 in experiment 2, the system was able to recognize the six trucks as empty logging trucks. Of course, the cost of generalizing all models in this manner was that the FA rate increased from 0.0 to 1.7 FA/km². Experiment 3 shows that AMs constructed from more limited training sets can be used to recognize targets with reasonable reliability in a separate test set. The training sets were limited in size and did not provide a good representative sampling of vehicle appearances. Consequently, AMs were constructed with large recognition tolerances in order to achieve high detection rates. The resulting elevated false-alarm rate was suppressed by roughly 50% through the use of supervised discrimination learning. Despite these limitations, reasonably good performance was evident in the separate test dataset.

In contrast, without the techniques of low-level MI and the automatic construction of complex AMs, we were unable to construct an ATR system for this application anywhere near as accurate, flexible, or robust as the one described in this article [18]. Of course, some credit for the performance of the system must go to the quality of the sensor images used. But images of good quality do not necessarily guarantee reliable detection performance. Even with an image of excellent quality, concealment and clutter can make target detection a challenging problem.

So far, XTRS has been applied to two other ATR problems: the recognition of armored vehicles both in forward-looking laser radar images [8] and in fully polarimetric synthetic-aperture radar images [9]. But XTRS provides the means of solving a more general class of object-detection problems. In addition to its use in recognizing military targets, XTRS has been applied successfully to the task of detecting and tracking hazardous weather phenomena in Doppler weather radars [11].

Acknowledgments 

The authors thank their colleagues in the Opto-Radar System Group at Lincoln Laboratory for providing the extensive laser radar dataset used in this research. This work was sponsored by the Defense Advanced Research Projects Agency (DARPA).


REFERENCES 


1. J.G. Verly and R.L. Delanoy, "Appearance-Model-Based Representation and Matching of 3-D Objects," Proc. 3rd Intl. Conf. on Computer Vision (Osaka, Japan, 4-7 Dec. 1990), p. 248.

2. J.G. Verly, B. Williams, and R.L. Delanoy, "Model-Based Pattern Recognition," U.S. Patent No. 5,123,057 (June 1992).

3. J.G. Verly, B. Williams, and R.L. Delanoy, private communication.

4. R.L. Delanoy, J.G. Verly, and D.E. Dudgeon, "Automatic Building and Supervised Discrimination Learning of Appearance Models of 3-D Objects," SPIE 1708, 344 (1992).

5. R.L. Delanoy, J.G. Verly, and D.E. Dudgeon, "Functional Templates and Their Application to 3-D Object Recognition," Proc. Intl. Conf. on Acoustics, Speech, and Signal Processing (ICASSP), San Francisco, 23-26 Mar. 1992, p. III-141.

6. R.L. Delanoy and J.G. Verly, "Computer Apparatus and Method for Fuzzy Template Shape Matching Using a Scoring Function," U.S. Patent No. 5,222,155 (June 1993).

7. R.L. Delanoy, J.G. Verly, and D.E. Dudgeon, "Pixel-Level Fusion Using Interest Images," Technical Report 979, MIT Lincoln Laboratory (26 Apr. 1993).

8. J.G. Verly, R.L. Delanoy, and D.E. Dudgeon, "Machine Intelligence Technology for Automatic Target Recognition," Linc. Lab. J. 2, 277 (1989).

9. J.G. Verly, R.L. Delanoy, and G. Lizott, "Principles and Evaluation of an Automatic Target Recognition System for Synthetic Aperture Imagery Based on the Use of Functional Templates," SPIE 1960 (1993), to be published.

10. J.G. Verly, R.L. Delanoy, and D.E. Dudgeon, "Model-Based System for Automatic Target Recognition from Forward-Looking Laser-Radar Imagery," Opt. Eng. 31, 2340 (1992).

11. R.L. Delanoy and S.W. Troxel, "Machine Intelligent Gust Front Detection," in this issue.

12. L.M. Novak, M.C. Burl, R.D. Chaney, and G.J. Owirka, "Optimal Processing of Polarimetric Synthetic-Aperture Radar Imagery," Linc. Lab. J. 3, 273 (1990).

13. J. Serra, Image Analysis and Mathematical Morphology (Academic Press, New York, 1982).

14. T.R. Esselman and J.G. Verly, "Applications of Mathematical Morphology to Range Imagery," Technical Report TR-797, MIT Lincoln Laboratory (Dec. 1987), DTIC #AD-184316.

15. T.R. Esselman and J.G. Verly, "Some Applications of Mathematical Morphology to Range Imagery," Proc. Intl. Conf. on Acoustics, Speech, and Signal Processing (ICASSP) 1, Dallas, 6-9 Apr. 1987, p. 243.

16. T.R. Esselman and J.G. Verly, "Feature Extraction from Range Imagery Using Mathematical Morphology," SPIE 845, 233 (1987).

17. Private communication.

18. Private communication.


• DELANOY ET AL.

Machine Intelligent Automatic Recognition of Critical Mobile Targets in Laser Radar Imagery

RICHARD L. DELANOY
is a staff member in the Machine Intelligence Technology
Group. His work spans the fields of computer vision,
machine learning, and construction of object recognition
systems. From 1980 to 1983, he was a research scientist at
the University of Virginia Department of Psychology, where
he investigated the biochemical correlates of learning and
the effects of stress-related hormones on
electrophysiological models of memory. Before joining
Lincoln Laboratory in 1987, he worked for GE Fanuc
Automation N.A., Inc., as a software engineer developing
numerical and programmable controllers for manufacturing
automation. Dick received a B.A. degree in biology from
Wake Forest University in 1973, a Ph.D. degree in
neuroscience from the University of Florida College of
Medicine in 1979, and an M.S. degree in computer science
from the University of Virginia in 1987. He was a National
Science Foundation Predoctoral Fellow and a National
Institute of Mental Health Postdoctoral Fellow.


JACQUES VERLY
received the Ingénieur Electronicien degree from the
University of Liège, Belgium, in 1975. Through a
sponsorship of the Belgian American Educational Foundation,
he came to the United States that year and received an M.S.
degree and a Ph.D. degree in electrical engineering from
Stanford University in 1976 and 1980, respectively. At
Stanford he performed doctoral research in the fields of
image reconstruction and restoration. Upon graduation, he
joined Lincoln Laboratory, where he has worked on (among
other things) computer vision problems associated with
laser radar data and, more recently, with fully
polarimetric synthetic aperture radar (SAR) imagery. He has
over 40 publications in the areas of image reconstruction
from projections (tomography), image processing, optical
signal processing, distributed signal processing,
mathematical morphology, computer vision, and automatic
target recognition. Jacques is the coholder of two U.S.
patents in the area of model-based object recognition, and
is a fellow of the Belgian American Educational Foundation.


DAN E. DUDGEON
is a senior staff member in the Machine Intelligence
Technology Group, where his focus of research has been in
automatic target recognition. Before joining Lincoln
Laboratory in 1979, he worked for Bolt, Beranek, and
Newman, Inc., in Cambridge, Massachusetts. Dan received the
following degrees from MIT: an S.B. in electrical science
and engineering, and an S.M., an E.E., and a Sc.D. in
signal processing. He was the corecipient of the 1976 IEEE
Browder J. Thompson Prize, and is the coauthor of two
books: Multidimensional Digital Signal Processing (Prentice
Hall, Englewood Cliffs, New Jersey, 1984) and Array Signal
Processing (Prentice Hall, 1993). Because of his
contribution in the field of multidimensional signal
processing, Dan was elected an IEEE Fellow in 1987. He is
also a fellow of the National Science Foundation.


Machine  Intelligent 
Gust  Front  Detection 

Richard  L.  Delanoy  and  Seth  W.  Troxel 

■  Techniques  of  low-level  machine  intelligence,  originally  developed  at  Lincoln 
Laboratory  to  recognize  military  ground  vehicles  obscured  by  camouflage  and 
foliage,  are  being  used  to  detect  gust  fronts  in  Doppler  weather  radar  imagery. 
This  Machine  Intelligent  Gust  Front  Algorithm  (MIGFA)  is  part  of  a  suite  of 
hazardous-weather-detection  functions  being  developed  under  contract  with  the 
Federal  Aviation  Administration.  Initially  developed  for  use  with  the  latest- 
generation  Airport  Surveillance  Radar  equipped  with  a  wind  shear  processor 
(ASR-9  WSP),  MIGFA  was  deployed  for  operational  testing  in  Orlando, 
Florida,  during  the  summer  of  1992.  MIGFA  has  demonstrated  levels  of 
detection  performance  that  have  not  only  markedly  exceeded  the  capabilities 
of  existing  gust  front  algorithms,  but  are  competitive  with  human 
interpreters. 


Gust  fronts  generated  by  thunderstorms  can 
seriously  affect  the  safety  and  efficiency  of 
airport  operations.  Lincoln  Laboratory,  un¬ 
der  contract  with  the  Federal  Aviation  Administra¬ 
tion  (FAA),  has  had  a  significant  role  in  the  develop¬ 
ment  of  two  Doppler  radar  systems  that  are  capable 
of  detecting  low-altitude  wind  shears,  including  gust 
fronts,  in  the  airport  terminal  control  area.  These 
systems  are  the  Terminal  Doppler  Weather  Radar 
(TDWR)  and  the  latest-generation  Airport  Surveil¬ 
lance  Radar  enhanced  with  a  wind  shear  processor 
(ASR-9  WSP). 

By  examining  images  generated  by  these  radars, 
experienced  human  observers  can  reliably  detect  and 
track  gust  fronts.  But  the  development  of  automated 
gust  front  detection  algorithms  having  sufficiently 
high  detection  rates  with  few  false  alarms  has  been 
elusive.  The  gap  between  human  and  computer  per¬ 
formance  is  due  to  several  limitations  of  the  detection 
algorithms.  These  limitations  include  the  lack  of  means 
for  handling  and  maintaining  weak,  ambiguous,  and 
contradictory  evidence,  the  use  of  multiple  sequen¬ 
tially  applied  thresholds  for  object  discrimination  (such 
thresholds can inadvertently result in the discarding
of important data), a failure to use all of the relevant
information  available  in  the  input  data,  and  the  in¬ 
effective  use  of  knowledge  regarding  the  behavior 
or  appearance  of  gust  fronts  under  different 
circumstances. 

Given  clear,  unambiguous  radar  gust  front  signa¬ 
tures,  existing  detection  algorithms  perform  reason¬ 
ably  well.  The  challenge  is  in  constructing  algorithms 
that  can  handle  the  marginally  detectable  ambiguous 
cases.  In  such  cases,  various  factors  must  be  consid¬ 
ered.  For  example,  gust  fronts  can  be  obscured  by 
large  areas  of  precipitation,  or  gust  front  signatures 
can  disappear  in  Doppler  velocity  images  whenever 
the  Doppler  viewing  angle  is  perpendicular  to  the 
direction  of  motion.  Furthermore,  gust  fronts  can  be 
mimicked  by  other  natural  phenomena,  such  as  flocks 
of  birds,  clouds  of  dust  stirred  up  at  construction 
sites, low-intensity rain, and ground clutter. And gust
fronts  can  have  very  low  radar  cross-section  densities, 
sometimes  below  the  sensitivity  of  the  radar  system. 

The preceding paragraph should sound familiar to
those involved in the development of automatic target
recognition (ATR) systems, for the issues are basically
the same. In addition to the continual trade-off
between detection rates and numbers of false alarms,
the issues for gust front detection are

1. obscuration and camouflage,

2. sensor limitations,

3. clutter and decoys, and

4. stealth.

FIGURE 1. Thunderstorm downdraft and resulting gust
front. The cool outflow beneath a thunderstorm spreads
out in all directions. The leading edge, where the cool
outflow and the warmer ambient air converge, is called
the gust front.

Not surprisingly, the overall design of existing gust
front detection algorithms is similar to that of most
ATR systems. This traditional design is characterized
by a hierarchy of modules, typically called detection,
extraction (or discrimination), and classification. The
detection process is essentially the application of some
threshold that has been chosen to maximize the probability
of detection at some acceptable level of false
detections. Where signals are found that are above
threshold, features are extracted, producing an abstraction,
or symbolic representation, of the raw data.
Given the set of extracted features, a signal is then
classified as either one of the object types being sought
or as clutter. In both the existing gust front detection
algorithms and the traditional ATR systems, detection
is generally unsophisticated: the threshold is
applied either to raw radar data or to a simple transformation
(such as a matched filtering) of the raw
data. Sophisticated machine intelligence techniques
are generally applied in the form of classifiers, e.g., by
the use of neural network, statistical, or model-based
classifiers.


However, the use of machine intelligence only for
the classification process leads to a problem. With the
application of a detection threshold, a significant
amount of information is discarded, including those
object signatures which are weak or ambiguous. Our
belief is that increased detection capability can be
achieved by applying machine intelligence techniques
prior to the application of detection thresholds.
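
The cost of thresholding early can be illustrated with a small sketch. The scores, weights, and thresholds below are made-up values, not taken from any gust front algorithm; the two function names are likewise hypothetical. A candidate with one weak piece of evidence is rejected by a cascade of per-feature thresholds, yet is accepted when the same evidence is combined before a single threshold is applied.

```python
# Illustrative only: the scores, weights, and thresholds are made-up values.

def passes_sequential_thresholds(scores, thresholds):
    """Traditional cascade: the candidate is dropped at the first failed test."""
    return all(score >= limit for score, limit in zip(scores, thresholds))


def passes_fused_threshold(scores, weights, threshold):
    """Combine all evidence (weighted mean) first, then apply one threshold."""
    fused = sum(w * s for w, s in zip(weights, scores)) / sum(weights)
    return fused >= threshold


# Hypothetical evidence for one candidate: strong, weak, and strong again.
evidence = [0.9, 0.45, 0.8]

cascade_result = passes_sequential_thresholds(evidence, [0.5, 0.5, 0.5])
fused_result = passes_fused_threshold(evidence, [1.0, 1.0, 1.0], 0.5)
```

Here the cascade discards the candidate because of the single 0.45 score, while the fused score (about 0.72) clears the same 0.5 limit. A weighted mean is only one possible combination rule and is used here for simplicity.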

A framework for applying machine intelligence
techniques at the earliest levels of signal (image) processing
is provided by the Experimental Target Recognition
System (XTRS) [1], a general-purpose machine
intelligence approach to ATR developed at
Lincoln Laboratory. Specific techniques of knowledge-based
signal processing, fuzzy set theory, and
pixel-level maps of spatial evidence are all part of this
approach. Based on XTRS, a Machine Intelligent
Gust Front Algorithm (MIGFA) has been constructed
for use with both TDWR and ASR-9 WSP imagery.
Of the two radar systems, the ASR-9 presents the
greater challenge to gust front detection because of
its lower sensitivity and less reliable Doppler measurements
in clear air. Thus, this article will focus on the
ASR-9 WSP version of MIGFA to demonstrate best
the algorithm’s effectiveness.

Gust  Fronts 

An  intense  thunderstorm  downdraft  can  arise  from 
various  processes  such  as  evaporative  cooling  and  fric¬ 
tional  drag  between  water  droplets  and  the  air.  Upon 
impact  with  the  ground,  the  downdraft  is  deflected 
horizontally  (Figure  1),  producing  a  local  region  of 
divergent  winds.  The  downdraft  feeds  an  outflow  of 
outwardly  expanding  cool  air.  At  the  leading  edge  of 
the  outflow  exists  a  boundary  where  cool  outflow  air 
collides (converges) with the warmer ambient air.
This leading-edge boundary, called a gust front, can
grow  to  be  many  kilometers  long  and  can  propagate 
far  away  from  the  generating  storm. 

The turbulence within a gust front can be severe
enough to present a danger to aircraft during takeoff
and landing. And, because the prevailing winds behind
a gust front can persist for a long time, the
passage of a gust front over an airport often necessitates
a change of active runway. When unanticipated,
a gust front can delay airport operations as aircraft are
rerouted to a different runway. Aside from issues of
cost and inconvenience, delays can increase the risk
of potentially fatal human errors as the distance
between aircraft that are taking off or landing decreases
and the work load on air traffic controllers
increases. With sufficient warning, though, controllers
can incorporate in their plans a change in active
runway at the anticipated time of a gust front's arrival,
thereby minimizing the hazards and costs associated
with delays.


Gust fronts can be detected in Doppler radar imagery
on the basis of three physical properties: velocity
convergence, thin lines, and motion. Figure 2
shows a typical gust front in both TDWR and ASR-9
WSP images.

FIGURE 2. An example gust front in (a) TDWR and (b) ASR-9 WSP images. The left radar plots are reflectivity images with
units in dBZ. The right radar plots are Doppler images with units in m/sec. The different signatures (see main text) of the
gust front have been indicated by a human interpreter. Panel annotations: (a) reflectivity thin-line signature and
velocity-convergence signature; (b) reflectivity thin-line signature and velocity-variance signature.

The air within and behind a gust front converges
with the ambient air ahead of the gust front. In a
Doppler velocity image, this activity is observable as a
boundary between regions of converging velocities.
When viewed along a single radial, the convergence
signature is characterized by a relatively sharp decrease
in radial Doppler values with distance (Figure
3). Because Doppler radars can measure only the
component of the wind that is directed along the
beam, Doppler velocity measurements can often underestimate
the true wind speed. In the extreme, the
convergence signature of a gust front disappears completely
when the direction of motion is perpendicular
to the radar beam azimuth. The TDWR velocity
image shown in Figure 2(a) demonstrates this problem.
In the figure, the portion of the front closest to
the radar site has a direction of motion that is nearly
radially aligned, resulting in a pronounced convergent
boundary for that area. However, at the ends of
the gust front, where the direction of motion is more
azimuthal, the boundary is more difficult to detect.
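
The geometric limitation described above follows from simple trigonometry: a Doppler radar sees only the projection of the wind vector onto the beam. The short sketch below uses a hypothetical helper, `radial_component`, and made-up speeds and angles to show the measured value shrinking as the motion turns away from the beam and vanishing when it is perpendicular.

```python
import math


def radial_component(wind_speed, wind_direction_deg, beam_azimuth_deg):
    """Component of the wind vector along the radar beam.

    Both angles are in degrees and use the same convention (direction of
    travel); only their difference matters. A positive result means motion
    along the beam direction, a negative one motion against it.
    """
    offset = math.radians(wind_direction_deg - beam_azimuth_deg)
    return wind_speed * math.cos(offset)


# A 15 m/sec wind seen by a beam pointing at azimuth 90 degrees.
aligned = radial_component(15.0, 90.0, 90.0)         # motion along the beam
oblique = radial_component(15.0, 150.0, 90.0)        # 60 degrees off the beam
perpendicular = radial_component(15.0, 180.0, 90.0)  # motion across the beam
```

With these inputs the radar would report the full 15 m/sec, half of it, and essentially zero, respectively, which is why the convergence boundary fades toward the ends of a curved front.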

The  thin-line  signature  is  generally  thought  to  be 
produced  by  the  concentration  of  scatterers  (dust, 
insects,  rain  droplets)  along  the  leading  edge  of  the 
thunderstorm  outflow.  Some  gust  fronts  produce  a 
distinctive  cloud  formation  along  the  gust  front,  which 
can  also  contribute  to  the  thin-line  reflectivity.  The 
thin  line  varies  in  width  but  seldom  exceeds  3  km. 
Typical  maximum  reflectivities  reported  by  the  ASR- 
9  along  gust  fronts  are  in  the  range  of  10  to  20  dBZ. 
But  significant  portions  of  many  thin  lines  can  have 
reflectivities  as  low  as  -5  dBZ,  which  is  near  or  below 
the  threshold  of  detectability  for  the  ASR-9.  (Note: 
The  basic  unit  of  measurement  for  radar  reflectivity  is 
dBZ.  Reflectivities  of  50  dBZ  or  more  are  typical  of 
intense  thunderstorms  with  heavy  rain.  Background 
typically  has  reflectivity  values  between  -15  and  0 
dBZ.)  Because  of  ground-clutter  obscuration,  the  qual¬ 
ity  of  a  thin-line  signature  often  degrades  at  close 
range,  and  the  signature  can  even  vanish  as  the  gust 
front  passes  over  the  radar.  As  the  front  moves  out  of 
the  cluttered  region,  the  signature  often  reestablishes 
itself.  This  type  of  degradation  is  especially  trouble¬ 
some  for  the  ASR-9  because  of  the  radar’s  on-airport 
location,  which  makes  it  more  prone  to  detection  loss 
when  a  gust  front  is  affecting  the  airport. 
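
Because dBZ is a logarithmic unit (ten times the base-10 logarithm of the reflectivity factor Z, with Z expressed in mm^6/m^3), the reflectivities quoted above span a very wide linear range. The sketch below converts between the two scales; the function names are hypothetical and the sample values are the ones mentioned in the text.

```python
import math


def dbz_to_linear(dbz):
    """Reflectivity factor Z (mm^6/m^3) for a value given in dBZ."""
    return 10.0 ** (dbz / 10.0)


def linear_to_dbz(z):
    """Value in dBZ for a reflectivity factor Z given in mm^6/m^3."""
    return 10.0 * math.log10(z)


storm = dbz_to_linear(50.0)           # intense thunderstorm with heavy rain
weak_thin_line = dbz_to_linear(-5.0)  # weak portion of a gust front thin line
contrast = storm / weak_thin_line     # linear ratio between the two
```

The ratio comes out at roughly 3 x 10^5, which gives a sense of how faint a weak thin line is compared with the storm that produced it.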

A final key gust front signature is motion. When
sequential radar scans are compared, convergence and
thin-line signatures of a gust front will move conspicuously
in a direction perpendicular to the orientation
of the convergence boundary and reflectivity
thin line. Signatures that do not move are either not
gust fronts (they could be, e.g., false alarms from
range-ambiguous echoes, discussed in the subsection
“Feature Detection,” edges of storm regions, or ground
clutter) or they are gust fronts that are not operationally
significant. Within limits, gust fronts tend to
move uniformly as outwardly expanding curved
boundaries; i.e., the propagation speed tends to be
consistent along the front's length and across time. Of
course, when gust fronts collide, the motion may
become more erratic.

FIGURE 3. Example velocity-convergence signature associated
with a gust front. (The plot marks the velocity-convergence zone.)

If these signatures were 100% reliable, detection
would be a trivial task. For some gust fronts, however,
one or more signatures may be weak, ambiguous, or
entirely absent. For example, convergence signatures
disappear when the radar beam is perpendicular to
the wind velocity. Reflectivity thin lines and thin-line
motion can disappear when a gust front is obscured
by storm regions. To complicate matters further, none
of these signatures are unique to gust fronts. Vertical
shears, often present in severe thunderstorms, can
bias low-altitude velocity estimates, producing apparent
convergence signatures. Range-ambiguous echoes,
ground clutter, flocks of birds, and elongated
patches of low-intensity precipitation can all appear
as reflectivity thin lines. Motion can be associated
with anything (e.g., clouds or airborne dust) that
follows the ambient wind. In short, each signature
can be missing and each signature can be mimicked
by other observable phenomena. Consequently, successful
discrimination requires knowledge of the circumstances
for which these signatures are reliable as
well as knowledge of gust front behavior. Only by
weighing the quality of several signatures simultaneously
can an automated system detect gust fronts
with near human performance.

The task is difficult enough with TDWR data.
And yet the TDWR is a pencil-beam radar, designed
for weather sensing, with enough sensitivity to generate
reliable Doppler values in relatively clear air and
enough resolution in elevation to provide three-dimensional
images of weather phenomena. In contrast,
the ASR-9 is a surveillance radar that was
not originally intended for weather imaging. With
a fan-beam design, the ASR-9 vertically integrates
signals into a single two-dimensional representation.
Because the transmitted energy is distributed over a
wider arc of elevation, the energy returned from a
low-altitude, low-reflectivity gust front will be small
relative to the energy filling the remainder of the
sample volume. With this reduced sensitivity, gust
front detection is much more difficult. Almost all
convergence signatures are eliminated for the ASR-9
because the reflectivity returns from clear air are below
the radar's threshold of detectability, which makes the
Doppler values unreliable. Even for cases in
which gust fronts pass through regions of high
reflectivity, convergence cannot be used reliably for
gust front detection. For example, the signal contribution
from overhanging precipitation near the edges
of storms can bias the low-level wind-velocity estimate
when there is vertical wind shear. Without convergence
signatures, thin line and thin-line motion
become the primary signatures for detecting gust fronts
in ASR-9 WSP imagery. In the example ASR-9 WSP
reflectivity image of Figure 2(b), the gust front is
visible. But note that while the TDWR thin line is
quite strong, the ASR-9 WSP thin line shows less
contrast, is somewhat more fragmented, and does not
extend as far as is apparent in the TDWR data.

Although a convergence signature is missing from
the ASR-9 WSP velocity image of Figure 2, the gust
front is still visible. The accuracy of velocity estimations
degrades markedly over the range of signal-to-noise
values associated with low reflectivity returns.
For this reason, gust fronts are observable in ASR-9
WSP velocity images as bands of low-variance Doppler
values, with high variance in the low signal-to-noise
regions ahead and behind the gust front. This
velocity-variance thin line is an alternative signature
used in the ASR-9 WSP version of MIGFA. In addition,
implicit zones of convergence can be identified.
Doppler values within the gust front thin line are
used to estimate winds behind the gust front. The
environmental low-level wind velocity ahead of the
storm can be measured by some other means, for
example, from a network of anemometers at the
airport. A comparison of these two wind-velocity
estimates can be used to confirm that convergence
exists somewhere between the gust front and the
anemometer site.
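
The velocity-variance idea can be sketched in a few lines. The example below is not the MIGFA feature detector; it is a minimal one-dimensional illustration, with made-up Doppler values and a hypothetical window length, of how a band of coherent velocities stands out from noisy low signal-to-noise returns, and how the values inside that band yield a wind estimate behind the front.

```python
from statistics import pvariance


def windowed_variance(values, window):
    """Population variance of each run of `window` consecutive samples."""
    return [pvariance(values[i:i + window])
            for i in range(len(values) - window + 1)]


def lowest_variance_window(values, window):
    """Start index of the window whose Doppler values vary the least."""
    variances = windowed_variance(values, window)
    return min(range(len(variances)), key=variances.__getitem__)


# Made-up Doppler values (m/sec) along one radial: noisy, coherent, noisy.
doppler = [8.0, -7.0, 5.0, -9.0, 3.0, 3.2, 2.9, 3.1, -6.0, 9.0, -8.0, 7.0]

start = lowest_variance_window(doppler, 4)
behind_front_estimate = sum(doppler[start:start + 4]) / 4
```

The coherent band (samples 4 through 7) is picked out, and its mean of about 3 m/sec is the kind of estimate that could then be compared with an anemometer reading ahead of the front to infer convergence.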

Background 

Automated radar gust front detection algorithms have
been under development and evolution for almost ten
years. H. Uyeda and D. Zrnic [2] first described an
automated detection algorithm, developed for the Next
Generation Weather Radar (NEXRAD), that was
based solely on detecting velocity convergence along
radials. The algorithm was successful in locating and
tracking the strong gust fronts that commonly occur
in Oklahoma during the spring.

An improved version of the initial algorithm reduces
false alarms by requiring vertical association of
gust front signatures from two different low-altitude
elevation scans. The improved algorithm, known as
the Gust Front Detection Algorithm (GFDA), also
incorporates a technique for estimating horizontal
winds ahead and behind detected gust fronts [3, 4].
As with its predecessor, GFDA detects velocity convergence
along radials. GFDA is the algorithm currently
intended for use in the initial operational deployment
of TDWR systems.

Briefly described, GFDA begins with a search in
each radial for runs, or segments, of decreasing radial
velocity, indicating convergent shear. Segments in
which the maximum shear exceeds a predetermined
threshold are logically grouped into features on the
basis of end-point-proximity and segment-overlap
tests. The feature attributes are then tested against a
number of thresholds and are kept, discarded, or
combined with other features. After separately processing
each of the two full-circle scans from different
altitudes, the algorithm tests for vertical continuity of
the features between the scans. Features that exhibit
vertical continuity and that exceed a minimum-length
threshold are declared to be gust fronts. The reported
location of the detected gust front is determined by
fitting a curved line through the peak shear of each
segment in the gust front feature. Sequential detections
are associated over time to build detection histories
for each gust front upon which propagation speeds
are estimated and forecasts generated.
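
The first step of this description, the per-radial search for runs of decreasing velocity, can be sketched as follows. This is a simplified illustration rather than GFDA itself: the function name, the gate spacing, the shear threshold, and the sample velocities are all assumptions, and the later grouping, vertical-continuity, and curve-fitting stages are omitted.

```python
def convergent_segments(velocities, gate_spacing_km, shear_threshold):
    """Find runs of strictly decreasing radial velocity along one radial.

    Returns (start_gate, end_gate, peak_shear) for every run whose largest
    gate-to-gate shear, in (m/sec)/km, reaches the threshold.
    """
    segments = []
    i = 0
    last = len(velocities) - 1
    while i < last:
        if velocities[i + 1] < velocities[i]:
            start = i
            peak_shear = 0.0
            # Extend the run while the velocity keeps decreasing.
            while i < last and velocities[i + 1] < velocities[i]:
                shear = (velocities[i] - velocities[i + 1]) / gate_spacing_km
                peak_shear = max(peak_shear, shear)
                i += 1
            if peak_shear >= shear_threshold:
                segments.append((start, i, peak_shear))
        else:
            i += 1
    return segments


# Made-up radial: a sharp drop between gates 1 and 5, then a small ripple.
radial = [5.0, 5.2, 5.1, 2.0, -3.0, -3.5, -3.4, -3.6]
found = convergent_segments(radial, gate_spacing_km=0.5, shear_threshold=5.0)
```

Only the sharp drop survives; the ripple at the end of the radial is also a run of decreasing velocity, but its shear falls below the threshold and it is discarded, which is exactly the kind of early decision the article goes on to question.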

Lincoln  Laboratory,  in  conjunction  with  the  Na¬ 
tional  Severe  Storm  Laboratory  (NSSL),  has  since 
developed  the  Advanced  Gust  Front  Algorithm 
(AGFA) [5, 6], which contains several enhancements,
including  reflectivity  thin-line  detection.  AGFA  de¬ 
tects  thin  lines  by  finding  local  maxima  of  reflectivity 
values  that  are  consistent  with  the  widths  and  intensi¬ 
ties  associated  with  gust  fronts.  Thin-line  segments 
are  generated  twice:  once  by  constructing  segments 
over  all  range  gates  along  a  radial  and  once  by  con¬ 
structing  segments  across  radials  along  arcs  of  con¬ 
stant  range.  The  final  thin-line  features  consist  of  lists 
of  the  points  connecting  the  centers  of  each  of  the 
segments.  Convergence  and  thin-line  features  are  fused 
on  the  basis  of  end-point  proximity  and  orientation. 
AGFA  does  not  use  motion  as  a  signature  for  detect¬ 
ing  gust  fronts.  Motion  is  used  only  in  heuristics  that 
reject  false  features  after  they  have  been  extracted. 
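
A rough sketch of thin-line segment finding along a single radial is given below. It is not AGFA's implementation; it assumes, for illustration only, that a thin-line segment is a run of gates whose reflectivity lies in a plausible gust front band, is no wider than a maximum width, and is flanked on both sides by weaker returns. The 3 km width limit echoes the thin-line width mentioned earlier; the other numbers and names are hypothetical.

```python
def thin_line_segments(dbz, gate_km, min_dbz, max_dbz, max_width_km):
    """Find narrow reflectivity bumps along one radial.

    Returns (start_gate, end_gate, peak_gate) for each run of gates whose
    values lie in [min_dbz, max_dbz], whose width does not exceed
    max_width_km, and whose neighbors on both sides are below min_dbz.
    """
    segments = []
    n = len(dbz)
    i = 0
    while i < n:
        if min_dbz <= dbz[i] <= max_dbz:
            start = i
            while i < n and min_dbz <= dbz[i] <= max_dbz:
                i += 1
            end = i - 1
            narrow = (end - start + 1) * gate_km <= max_width_km
            left_low = start == 0 or dbz[start - 1] < min_dbz
            right_low = end == n - 1 or dbz[end + 1] < min_dbz
            if narrow and left_low and right_low:
                peak = max(range(start, end + 1), key=lambda k: dbz[k])
                segments.append((start, end, peak))
        else:
            i += 1
    return segments


# Made-up radial: a thin line at gates 2-4, then a storm cell and its edge.
profile = [-10.0, -8.0, 12.0, 15.0, 11.0, -9.0, -12.0,
           45.0, 50.0, 48.0, 14.0, -10.0]
lines = thin_line_segments(profile, gate_km=0.5, min_dbz=5.0,
                           max_dbz=25.0, max_width_km=3.0)
```

The bump at gates 2 through 4 is reported with its peak at gate 3, while the 14 dBZ gate on the storm edge is rejected because its inner neighbor is far stronger, so it is not a local maximum.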

During field testing in 1990 and 1991, a customized
version of AGFA was used for gust front detection
on an ASR-9 WSP [7, 8]. Because of the lack of
reliable  velocity-convergence  features,  the  ASR-9  ver¬ 
sion  of  AGFA  was  configured  to  operate  in  a  thin- 
line-only  detection  mode.  Although  the  algorithm 
was  successful  in  detecting  gust  fronts  that  had  thin- 
line  signatures  of  good  quality,  it  had  some  difficulty 
detecting  gust  fronts  when  the  reflectivity  was  weak 
or  fragmented.  Lacking  convergence  signatures  to  con¬ 
firm  the  existence  of  gust  fronts,  the  algorithm  was 
prone  to  false  alarms  triggered  by  elongated  low- 
reflectivity  weather  echoes  that  are  sometimes  associ¬ 
ated  with  stratiform  rain.  Installing  suboptimal  detec¬ 
tion  thresholds  to  reduce  the  false-alarm  rate  further 
reduced  the  detection  probabilities. 

In the above study, the scoring was done against
human interpretations of the same images used as
input to the algorithm. The discrepancy between human
and AGFA performance appears to be partially
due  to  AGFA’s  not  making  full  use  of  a  variety  of 
additional  information  that  is  available  in  the  ASR-9 
WSP  data,  including  velocity  thin  lines  and  thin-line 
motion.  Moreover,  both  GFDA  and  AGFA  rely  on 
sequentially  applied  thresholds  to  discriminate  gust 
fronts  from  background.  When  the  relevant  signals 
are  weak  or  ambiguous,  the  use  of  thresholds  in  the 
early  stages  of  processing  can  result  in  the  elimination 
of  potentially  relevant  information,  thus  setting  un¬ 
necessary  limits  on  detection  performance.  GFDA 
and AGFA also rely on one-dimensional signal processing
operations to locate gust fronts. The extraction
of chains of points across the second dimension
is  done  at  a  higher,  heuristic  level  of  processing.  In 
contrast,  two-dimensional  signal  processing  opera¬ 
tions  can  directly  establish  the  shape  of  gust  fronts 
without  relying  on  heuristics.  Finally,  these  early  gust 
front  algorithms  have  no  systematic  means  of  condi¬ 
tionally  fusing  information  from  various  sources  by 
taking  into  account  the  different  reliabilities  of  the 
sources.  Different  signatures  can  have  varying  reliabil¬ 
ity  depending  on  the  situational  context. 

Low-Level  Machine  Intelligence 

The  conventional  wisdom  in  computer  vision/object 
recognition  research  has  been  to  use  general  image 
processing  operations,  ideally  devoid  of  object-  and 
context-dependent  knowledge,  at  the  initial  stages  of 
processing.  Such  operations  might  include  edge  de¬ 
tection,  segmentation,  cleaning,  and  motion  analysis. 
And  yet  the  ideal  has  never  really  been  achieved  in 
practice.  For  example,  some  knowledge  of  the  sensor 
and  the  expected  scene  contents  must  be  implicitly 
encoded  in  the  form  of  thresholds  or  other  similar 
parameters  to  detect  edges  effectively. 

From  the  results  of  such  general  operations,  image 
characteristics  are  extracted  and  represented  symboli¬ 
cally.  Machine  intelligence  is  then  applied,  as  if  by 
definition,  only  on  the  symbolic  representations  at 
higher  levels  of  processing. 

MIGFA has inherited the development environment,
control structure, knowledge-based signal processing,
and several other important attributes of
XTRS. In contrast to more conventional approaches
to object recognition, sensor-, object-, and context-dependent
knowledge is applied in the earliest levels
of processing, i.e., at the image processing stage. As
used in MIGFA, low-level machine intelligence applies
knowledge in three ways.

First, knowledge of the current environment is
used  to  choose  from  a  library  those  feature  detectors 
which  are  selectively  indicative  of  the  object  being 
sought.  Using  multiple  independent  feature  detec¬ 
tors,  MIGFA  can  adapt  to  different  contextual  cir¬ 
cumstances.  At  the  beginning  of  the  processing  of 
each  scan,  a  rule-based  expert  examines  contextual 
information  to  select  a  set  of  feature  detectors  known 
through  experience  to  be  the  most  effective  for  a 
given set of circumstances. In the extreme, this process
would enable MIGFA to adapt itself dynamically
to  changes  in  the  environment.  Currently,  the  only 
rule  used  by  MIGFA  selects  between  two  fixed  alter¬ 
native  sets  of  feature  detectors,  one  set  customized  for 
the  TDWR  and  the  other  customized  for  the  ASR-9 
WSP.  Because  of  the  redundancy  inherent  in  the  use 
of  multiple  feature  detectors,  MIGFA  tends  to  be 
robust:  the  malfunction  of  a  feature  detector  or  even 
the  absence  of  one  data  source  does  not  necessarily 
halt  processing  and  may  have  only  minor  effects  on 
detection  performance. 

Second, knowledge is also incorporated within feature
detectors through the design of matched filters that are
customized to the physical properties of the sensor, the
environment, and the object to be detected. A new technique
of knowledge-based signal processing, called functional
template correlation (FTC), allows the construction of
customized signal processing operations that are more
effective than standard operations (see the box, “Functional
Template Correlation”). The output of FTC is a map of
numeric values in the range [0,1] that indicate the degree
of match between the pattern of pixels in an image region
and the feature or object encoded in the functional
template.

Finally, knowledge of the varying reliabilities of the
selected feature detectors is used to guide data fusion and
extraction. Conditional data fusion is simplified by using
“interest” as a common denominator [9]. An interest image is
a spatial map of evidence for the presence of some feature
that is selectively indicative of an object being sought
(the output of FTC is an interest image as long as the
functional template encodes an indicative feature). Higher
pixel values reflect greater confidence that the intended
feature is present at that location. Using interest as a
common denominator, MIGFA fuses data by combining interest
images derived from various pixel-registered sensory
sources. Using simple or arbitrarily complex rules of
arithmetic, fuzzy logic, or statistics, MIGFA can assimilate
pixel-level evidence from several coregistered sources into
a single combined interest image. Clusters of high values in
such combined interest images are then used to guide
selective attention and can serve as the input for object
extraction. If done effectively, the combined interest image
provides a better representation of object shape than is
evident in any single sensor modality. Using these
techniques, MIGFA performs a significant amount of
knowledge-based processing before the application of the
first discriminating threshold. Most traditional perception
systems apply one or several thresholds early in the
processing as a way of quickly reducing the amount of data to
be processed. However, especially with ambiguous data, each
applied threshold closes off options for detecting an
object. A better strategy, one attempted in XTRS and MIGFA,
is to apply thresholds only after evidence from many sources
of information has been meaningfully fused into a single map
of evidence.

MIGFA  Design 

The system block diagram in Figure 4 is an overview of MIGFA
as configured for ASR-9 WSP data. In preparation for
processing, input images V (Doppler velocity) and DZ
(reflectivity) from the current radar scan are converted
from polar to Cartesian representation and scaled to a
useful resolution. Image SD is a map of the local standard
deviations of V values. The SD and DZ images are then passed
to multiple simple independent feature detectors that
attempt to localize those features which are selectively
indicative of gust fronts. The outputs of each of these
feature detectors, most of which are based on some
application of FTC, are expressed as interest images that
specify evidence indicating where and with what confidence a
gust front may be present. The different interest images




FUNCTIONAL TEMPLATE CORRELATION

Functional template correlation (FTC) [1, 2] is a
generalized matched filter that incorporates aspects of
fuzzy set theory. Consider, as a basis for understanding,
the basic image processing tool autocorrelation. Given some
input image I, an output image O is generated by matching a
kernel K against the local neighborhood centered on each
pixel location I_xy. The match score assigned to each pixel
O_xy is computed by multiplying each element value of K by
the superimposed element value in I and summing across all
products. If the shape to be matched can vary in
orientation, then the pixel I_xy is probed by K at multiple
orientations. The score assigned to O_xy is the maximum
across all orientations.

FTC is fundamentally the same operation with one important
exception: whereas the kernel used in autocorrelation is an
array of image values (the array is essentially a subimage
of the image to be probed), the kernel used in FTC is an
array of scoring functions. The scoring functions return
scores that indicate how well the image values match the
expectations of the values at each element of the kernel.
The returned scores are averaged and “clipped” to the
continuous range [0,1]. (In the clipping process, those
averaged scores which are less than zero are assigned a
value of zero while those averaged scores which are greater
than one are assigned a value of one.) The output of FTC is
a map of these values, each of which reflects the degree
that the shape or object implicitly encoded in the
functional template is present at that image location.
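The averaging and clipping steps described above can be sketched in a few lines of Python. This is only an illustrative reading of the description, not the authors' implementation: the function name `ftc`, the use of negative kernel entries to mark guard elements, and the edge-replicating border treatment are all assumptions, and the search over multiple orientations is omitted.

```python
import numpy as np

def ftc(image, index_kernel, scoring_functions):
    """Single-orientation functional template correlation (sketch).

    index_kernel holds integers: k >= 0 selects scoring_functions[k];
    a negative entry (an assumed convention) marks a guard element.
    Each scoring function must accept and return a NumPy array.
    """
    kh, kw = index_kernel.shape
    h, w = image.shape
    # Replicate border pixels so every location has a full neighborhood
    # (border handling is not specified in the text; this is an assumption).
    padded = np.pad(image, ((kh // 2, kh // 2), (kw // 2, kw // 2)), mode="edge")
    total = np.zeros((h, w), dtype=float)
    used = 0
    for dy in range(kh):
        for dx in range(kw):
            k = index_kernel[dy, dx]
            if k < 0:
                continue  # guard element: image value has no effect
            window = padded[dy:dy + h, dx:dx + w]
            total += scoring_functions[k](window)
            used += 1
    # Average the returned scores, then clip the average to [0, 1].
    return np.clip(total / used, 0.0, 1.0)
```

To handle shapes of unknown orientation, the same function could be called once per rotated kernel and the pixelwise maximum taken across the results.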

Consider as an example the functional template
implementation of a simple matched filter designed to detect
gust fronts in reflectivity data (Figure A). Gust fronts are
observed as thin lines of moderate reflectivity
(approximately 0 to 20 dBZ) that are flanked on both sides
by low reflectivity (approximately -15 to 0 dBZ). Figure
A(1) shows the template kernel consisting of integers that
correspond to the two scoring functions shown in Figure
A(2). Elements of the kernel that do not correspond to
either of the scoring functions form guard regions in which
image (i.e., reflectivity) values are ignored and have no
effect on match scores. Scoring function 0, corresponding to
the flanking regions of low reflectivity, returns a maximal
score of 1.0 for image values in the interval of -20 to
-5 dBZ, a gradually decreasing score for image values in the
interval -5 to 10 dBZ, and a score of -2.0 for image values
larger than 10 dBZ. Scoring function 1, corresponding to the
center of the kernel where moderate reflectivity values are
expected, returns maximal scores in the interval 5 to
12.5 dBZ and gradually decreasing scores for both higher and
lower image values. Note that although very low image values
can generate scores of -1.0, a slower decline in score with
a minimum score of 0.0 is returned for image values above
the maximal scoring interval. This asymmetry is an attempt
to mitigate the obscuring effects of storm regions and other
patches of high reflectivity.
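The two scoring functions just described can be approximated as piecewise-linear curves. The sketch below is an assumption-laden illustration: the breakpoints for scoring function 0 are taken from the text, but the linear shape of the declines, and the points at which scoring function 1 reaches -1.0 (here -10 dBZ) and 0.0 (here 40 dBZ), are hypothetical values chosen only for demonstration.

```python
import numpy as np

def score_flank(dbz):
    """Scoring function 0: flanking regions of low reflectivity.

    1.0 from -20 to -5 dBZ, declining (assumed linearly) to -2.0 at 10 dBZ.
    np.interp holds the end values constant outside the breakpoints.
    """
    return np.interp(dbz, [-20.0, -5.0, 10.0], [1.0, 1.0, -2.0])

def score_center(dbz):
    """Scoring function 1: center strip of moderate reflectivity.

    Maximal from 5 to 12.5 dBZ. The low side drops to -1.0 and the high
    side declines more slowly to 0.0; the -10 and 40 dBZ breakpoints are
    hypothetical.
    """
    return np.interp(dbz, [-10.0, 5.0, 12.5, 40.0], [-1.0, 1.0, 1.0, 0.0])
```

The asymmetry is visible in the breakpoints: the score falls by 2.0 over 15 dBZ below the maximal interval but only by 1.0 over 27.5 dBZ above it.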

In general, by increasing or decreasing the intervals over
which affirming scores (i.e., scores > 0.5) are returned,
scoring functions can encode varying degrees of uncertainty
with regard to which image values are allowable. In
addition, knowledge of how a feature or object appears in
sensor imagery can be encoded in scoring functions. The
interfering effects of occlusion, distortion, noise, and
clutter can be minimized by the use of various design
strategies [3]. As a consequence, matched filters customized
with FTC for specific applications are generally more robust
than classical signal processing operations. In the
thin-line matched-filter example shown in Figure A, the
filter does not simply find thin lines, but selects those
thin lines which have reflectivity values within a
particular range. Furthermore, the matched filter can
display differential tolerances to image values that are
higher or lower than the expected range of values. In the
automatic target recognition (ATR) systems developed at
Lincoln Laboratory, FTC has been used primarily as a direct
one-step means of three-dimensional object detection and
extraction. In the Machine Intelligent Gust Front Algorithm
(MIGFA), FTC is used more as a signal processing tool for
edge detection, thin-line filtering and smoothing, shape
matching, skeletonizing, and erosion.

FIGURE A. Example functional template for thin-line feature detection: (1) index kernel and
(2) corresponding scoring functions. By increasing or decreasing the intervals over which affirming
scores (i.e., scores > 0.5) are returned, scoring functions can encode varying degrees of uncertainty
with regard to which image values are allowable. In addition, knowledge of how a feature or object
appears in sensor imagery can be encoded in scoring functions.

If FTC were implemented literally as described here, the
computational expense would be prohibitive for most useful
tasks. But FTC is actually faster than autocorrelation if
the input data are scaled to a fixed integer range (e.g.,
0 to 255) and the scoring functions are implemented as a
precomputed two-dimensional lookup table that is indexed by
a scoring-function number and an image value.
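A minimal sketch of the lookup-table idea follows, assuming images already scaled to the integers 0 to 255. The names `build_score_table` and `ftc_lookup`, the `scale`/`offset` convention for recovering physical values, and the negative-index guard convention are assumptions made for illustration only.

```python
import numpy as np

def build_score_table(scoring_functions, scale, offset, levels=256):
    """Precompute table[k, v]: score of function k for scaled value v."""
    physical = np.arange(levels) * scale + offset  # scaled value -> physical units
    return np.stack([f(physical) for f in scoring_functions])

def ftc_lookup(scaled, index_kernel, table):
    """FTC in which every scoring-function call is a table lookup.

    scaled must have an integer dtype; negative kernel entries are
    treated as guard elements (an assumed convention).
    """
    kh, kw = index_kernel.shape
    h, w = scaled.shape
    padded = np.pad(scaled, ((kh // 2, kh // 2), (kw // 2, kw // 2)), mode="edge")
    total = np.zeros((h, w), dtype=float)
    used = 0
    for dy in range(kh):
        for dx in range(kw):
            k = index_kernel[dy, dx]
            if k < 0:
                continue
            # Index by (scoring-function number, image value).
            total += table[k, padded[dy:dy + h, dx:dx + w]]
            used += 1
    return np.clip(total / used, 0.0, 1.0)
```

Because the table is built once per template, the per-pixel cost is one array lookup and one addition per kernel element, regardless of how complicated the scoring functions are.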

References 

1. R.L. Delanoy, J.G. Verly, and D.E. Dudgeon, “Functional
Templates and Their Application to 3-D Object Recognition,”
Proc. Intl. Conf. on Acoustics, Speech, and Signal
Processing (ICASSP), San Francisco, Mar. 1992.

2. R.L. Delanoy and J.G. Verly, “Computer Apparatus and
Method for Fuzzy Template Shape Matching Using a Scoring
Function,” U.S. Patent No. 5,222,155 (June 1993).

3. R.L. Delanoy, J.G. Verly, and D.E. Dudgeon, “Pixel-Level
Fusion Using Interest Images,” Technical Report 979, MIT
Lincoln Laboratory (26 Apr. 1993).


FIGURE 4. Block diagram of the Machine Intelligent Gust Front Algorithm (MIGFA). For a description of the different
feature detectors, see the subsection “Feature Detection” in the main text.


are  fused  to  form  a  combined  interest  image,  thus 
providing  an  overall  map  of  evidence  indicating  the 
locations  of  possible  gust  fronts. 

From the combined interest image, fronts are extracted as
chains of points. The chains extracted from a radar scan,
collectively called an event, are integrated with prior
events by establishing a point-to-point correspondence.
Heuristics are then applied to reject those chain points
which have an apparent motion that is improbable. The
updated history is used to make predictions of where points
along the front will be located at some future time. Such
predictions are used in the processing of subsequent images,
specifically in the feature detector called ANTICIPATION. In
the output of ANTICIPATION, high interest values are placed
wherever fronts are expected to be, thereby selectively
sensitizing the system to detect gust fronts at specific
locations. ANTICIPATION is tuned so that it will not
automatically trigger a detection by itself but, when its
output is averaged with other interest images, it will
support weak evidence that would otherwise be insufficient
to trigger a detection. Figure 5 is a summary of the
processing steps for an example ASR-9 WSP scan.

Image  Preparation 

As discussed earlier, velocity convergence is an unreliable
signature for detecting gust fronts in ASR-9 WSP data. Gust
fronts, nevertheless, are visible in velocity images.
Because of the tendency for high-pass clutter-filtered
pulse-pair Doppler estimates in a velocity image to have
high variance in regions of low signal-to-noise ratios
(SNR), the local velocity variance is higher for an area of
clear air than for an area associated with slightly higher
reflectivity values. This information is translated into a
usable form by transforming the velocity image V into a map
of local standard deviations (the SD image). At each pixel
of V, the standard deviation was computed in the surrounding
[size illegible] pixel neighborhood and assigned to the
corresponding pixel in SD.
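The construction of the SD image can be sketched as a sliding-window standard deviation. The neighborhood size is not legible in this copy of the text, so `size` is left as a required argument; the function name `local_std` and the edge-replicating border treatment are assumptions.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view

def local_std(velocity, size):
    """Standard deviation of each pixel's size x size neighborhood.

    size should be odd so the window is centered on the pixel; borders
    are handled by replicating edge values (an assumption).
    """
    pad = size // 2
    padded = np.pad(velocity.astype(float), pad, mode="edge")
    windows = sliding_window_view(padded, (size, size))
    return windows.std(axis=(-2, -1))
```

Regions of clear air, where the Doppler estimates are noisy, would produce high values in the result, while the smoother velocities inside a front would produce low values.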

Pixel values for all images are scaled to the interval
0 to 255 to support subsequent FTC operations on the input
imagery. Each image is tagged with the scaling factor and
offset necessary to translate scaled values back to the
original physical values.
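One way to picture the scaling and tagging is a pair of helper functions, sketched below. The names `scale_to_bytes` and `to_physical`, and the choice of a linear mapping from a caller-supplied physical range `[lo, hi]`, are assumptions; the text states only that a scaling factor and an offset accompany each image.

```python
import numpy as np

def scale_to_bytes(image, lo, hi):
    """Map physical values in [lo, hi] linearly onto the integers 0..255.

    Returns the scaled image together with the (scale, offset) pair
    needed to translate scaled values back to physical values.
    """
    scale = (hi - lo) / 255.0
    offset = lo
    scaled = np.clip(np.rint((image - offset) / scale), 0, 255).astype(np.uint8)
    return scaled, scale, offset

def to_physical(scaled, scale, offset):
    """Invert scale_to_bytes, up to rounding and clipping."""
    return scaled.astype(float) * scale + offset
```

For example, with `lo = -20.0` and `hi = 43.75` the scale is 0.25 per level, so a physical value of 0 maps to level 80.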

Finally, the DZ and SD images are converted from polar
arrays (240 range bins x 256 radials) to Cartesian arrays
(130 x 130). Mapping is done by computing for each element
of the Cartesian array the range bin and radial at which the
corresponding value is to be found in the polar array.
During the mapping process, an implicit subsampling of the
data occurs. From an initial radial resolution of 120 m per
range bin and pixel size in the azimuthal dimension
decreasing from 680 m at 28 km, the final Cartesian image
has a pixel resolution of 480 m per pixel.
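The mapping described here, in which each Cartesian element looks up its source range bin and radial, can be sketched as follows. The array sizes and resolutions come from the text; placing the radar at the center of the output, putting north at the top, starting radial 0 at north with azimuth increasing clockwise, and using nearest-neighbor lookup are all assumptions.

```python
import numpy as np

def polar_to_cartesian(polar, bin_m=120.0, pixel_m=480.0, size=130):
    """Resample polar[range_bin, radial] onto a size x size Cartesian grid.

    Pixels farther from the radar than the last range bin are set to NaN.
    """
    n_bins, n_radials = polar.shape
    half = (size - 1) / 2.0
    y, x = np.mgrid[0:size, 0:size]
    east = (x - half) * pixel_m
    north = (half - y) * pixel_m                     # row 0 is the northern edge
    rng = np.hypot(east, north)
    azimuth = np.arctan2(east, north) % (2 * np.pi)  # clockwise from north
    r_idx = (rng / bin_m).astype(int)
    a_idx = (azimuth / (2 * np.pi) * n_radials).astype(int) % n_radials
    out = np.full((size, size), np.nan)
    valid = r_idx < n_bins
    out[valid] = polar[r_idx[valid], a_idx[valid]]
    return out
```

Because each 480-m output pixel takes a single polar sample, roughly four 120-m range bins fall within a pixel and only one is kept, which is the implicit subsampling mentioned in the text.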

Feature  Detection 

Given contextual information of the sensor being used, the
location of that sensor, and the environmental conditions, a
rule-based expert selects an appropriate set of feature
detectors for application to the input

FIGURE 5. Processed scan summary. In the first row are the DZ (reflectivity) image, SD (standard
deviation of velocity) image, and the combined interest image that has been computed from the DZ
and SD images. The second row begins with the extracted indexed event. White pixels are those
points which have been declared as part of a gust front. Gray pixels are those points which have not
been tracked long enough to establish sufficient confidence. In the history frame, the current chain
is shown in white and the preceding scans are shown in shades of gray (darker shades indicate
more distant events in time). In the predictions frame, fat gray pixels indicate the 10- and 20-min
forecasts of where the fronts are expected to be. Also shown at the bottom right corner are the
estimated time of arrival of the next gust front to cross the radar site, the speed of the winds
measured inside the front (in m/sec), and the direction (in degrees) from which the front is coming.


FIGURE 6. [The caption of Figure 6 and the body text on this page are illegible in the source.]


FIGURE 7. Combining interest images: weak evidence. This figure is similar to Figure 6 except that here the two gust fronts
are not clearly visible in any of the single interest images except for the anticipation image. In the combined interest image,
however, the gust fronts are much more apparent, illustrating how the fusion of weak evidence from multiple sources
enhances gust front detectability.


associated with gust fronts have low Doppler velocity
standard-deviation values within the front and high values
ahead and behind the front. Consequently, the scoring
function for the center strip returns maximal scores for low
values while the scoring function for the flanking regions
returns maximal scores for high values.

DZ-MOTION and SD-MOTION are two thin-line motion detectors
that are very similar to the basic thin-line detectors TL-DZ
and TL-SD. The detection of motion is based on simple
differencing. For example, in DZ-MOTION the DZ image from a
previous scan produced approximately [illegible] min before
the current scan is subtracted from the current DZ image. In
the differenced DZ image, gust fronts appear as white lines
(positive values at the front's position in the current
scan) that are trailed by parallel dark lines (negative
values at the front's position in the previous scan).
Although functional templates that can scan for parallel
white and dark thin lines simultaneously are feasible, these
types of templates have so far proven to be too
computationally expensive to operate within the real-time
constraints of the available computer resources. Thus the
existing DZ-MOTION simply looks for thin lines of positive
values. The functional template has a kernel that is
identical to the one shown in Figure A of the box,
“Functional Template Correlation,” but the scoring functions
are somewhat different because of the consequences of
differencing. The feature detector SD-MOTION is similar to
DZ-MOTION in that SD-MOTION also applies to the difference
of two sequential images a thin-line filter with customized
scoring functions. With this approach, thin lines that do
not move are given low interest values, reflecting the
belief that a stationary thin line is either not a gust
front or is a gust front that may be ignored. Because the
background in differenced images is reduced to values near
zero, DZ-MOTION and SD-MOTION tend to be more sensitive than
TL-DZ and TL-SD.
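The effect of differencing on moving and stationary thin lines can be seen in a toy example. The arrays and the helper name `difference_image` below are invented for illustration; only the subtraction itself comes from the text.

```python
import numpy as np

def difference_image(current, previous):
    """Subtract the previous scan from the current one.

    The cast to a signed type comes first: subtracting unsigned 8-bit
    arrays directly would wrap around instead of going negative.
    """
    return current.astype(np.int16) - previous.astype(np.int16)

# Toy scans: one thin line that moves two pixels, one that stays put.
previous = np.zeros((5, 10), dtype=np.uint8)
current = np.zeros((5, 10), dtype=np.uint8)
previous[:, 2] = 100   # moving line, earlier position
current[:, 4] = 100    # moving line, current position
previous[:, 7] = 100   # stationary line
current[:, 7] = 100

diff = difference_image(current, previous)
# Column 4 is positive (white), column 2 is negative (dark),
# and the stationary line in column 7 cancels to zero.
```

A thin-line template that scores only positive values would then respond at column 4 and ignore both the trailing dark line and the stationary line.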

One disadvantage of DZ-MOTION and SD-MOTION is that they
tend to produce false alarms when moving storms are present
because the leading edge of the storm may appear in the
differenced image as a thin line of positive values. For
reducing the likelihood of such false alarms, an image of
storm regions is generated with a round functional template
whose kernel has a diameter of 13 pixels (6.23 km). Wherever
storm regions are detected with this template, interest
values are decreased in DZ-MOTION and set to nil (i.e., no
opinion) in SD-MOTION.

A fifth feature detector, OUT-OF-TRIP, highlights
range-ambiguous echoes. Range-ambiguous echoes occur when
signals are reflected by weather more distant than the
maximum unambiguous range. Because the signals have traveled
farther, they arrive back at the radar receiver at the same
time as signals that are transmitted later and reflected
from nearer weather (hence the name OUT-OF-TRIP). For these
range-ambiguous echoes, the apparent range extent is
maintained while the azimuthal extent is reduced
proportional to the range; thus the signals have a
distinctive appearance as reflectivity thin lines that are
radially aligned and that are associated with high local
variance in the Doppler data. Because of their thin-line
appearance, range-ambiguous echoes are often inappropriately
given high interest values by both TL-DZ and DZ-MOTION.

The detection of out-of-trip signals is performed by
applying two functional templates simultaneously. One
template looks for radially aligned thin lines in the DZ
image, while the other requires that the corresponding SD
values are high. The result is an interest image that
highlights out-of-trip signals. After the combination of all
other interest images, the out-of-trip interest image is
subtracted from the combined interest image, thus
selectively suppressing evidence for the presence of gust
fronts where out-of-trip signals are found. Example outputs
of the OUT-OF-TRIP feature detector are shown in Figures 6
and 7.

The ANTICIPATION feature detector provides a mechanism,
based on situational context, for spatially adjusting the
detection sensitivity of MIGFA. High anticipation values get
averaged with interest values from other feature detectors
to increase the likelihood of detection at specific
locations. Similarly, low anticipation values suppress the
likelihood of detection.

The most important use of anticipation is as a replacement
for coasting. Simply defined, coasting is the continued
tracking of a target on a radar screen for some time
interval after the target has disappeared (i.e., after the
target's signal has fallen below some detection threshold).
Coasting assumes that the loss of a target's signal is not
due to a change in the target's behavior (e.g., a change in
velocity or perhaps the disappearance of the target). Gust
fronts, however, do change their behavior, as in cases in
which gust fronts collide. Consequently, the blind coasting
of a signal after the signal's loss is a potential source of
false alarms. As an alternative to blind coasting,
anticipation provides a mechanism for progressively
increasing the sensitivity of a detection system, supporting
weak evidence that would otherwise fall below detection
thresholds.

In MIGFA, prior history of the behavior of a particular gust
front is used to predict where that front is expected to be
in the current scan. The predictions are used to create a
band of elevated interest values, typically not so high as
to trigger a detection by themselves, but high enough to
raise collocated weak signals above threshold. In general,
as the length of time a gust front has been tracked
increases, the anticipation interest values can also be
increased. If absolute coasting is desired, interest values
can be increased to a level high enough to trigger a
detection without any other supporting evidence. Examples of
anticipation interest images are shown in Figures 6 and 7.
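As a rough sketch of how such a band might be built, the function below raises interest around each predicted point by an amount that grows with the number of scans the front has been tracked. Every numeric value here (`background`, `gain`, `ceiling`, `half_width`) is hypothetical, as is the function name; the text gives no specific levels.

```python
import numpy as np

def anticipation_image(shape, predicted_points, scans_tracked,
                       background=0.5, gain=0.05, ceiling=0.8, half_width=1):
    """Band of elevated interest around predicted front positions.

    The elevated level grows with track length but is capped at
    `ceiling`, which would be set below the detection threshold unless
    absolute coasting is wanted. All defaults are illustrative only.
    """
    image = np.full(shape, background, dtype=float)
    level = min(background + gain * scans_tracked, ceiling)
    for row, col in predicted_points:
        r0, r1 = max(row - half_width, 0), min(row + half_width + 1, shape[0])
        c0, c1 = max(col - half_width, 0), min(col + half_width + 1, shape[1])
        image[r0:r1, c0:c1] = np.maximum(image[r0:r1, c0:c1], level)
    return image
```

Raising `ceiling` above the detection threshold would turn the same mechanism into the absolute coasting mentioned in the text.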

Anticipation can also be used to adjust the sensitivity of
gust front detections on the basis of contextual knowledge.
Some examples follow:

1. Many gust fronts are not observable in radar data when
   the fronts are directly over the radar site because of
   obscuration by intense ground clutter. Even with
   anticipation of where a gust front is expected to be, the
   radar system can often lose the front as the front
   crosses over the radar site. To prevent such a loss,
   absolute coasting over the radar site can be accomplished
   by setting interest values within 2 km of the radar site
   to nil (i.e., missing values) for all interest images
   except the anticipation image. Consequently, the
   anticipation interest image will be the only image
   allowed to have an opinion of what exists directly over
   the radar site.

2. Gust front false alarms often occur from thin, elongated
   bands of low-reflectivity stratiform rain. In central
   Florida at least, gust fronts are seldom associated with
   the stratiform rain that often follows intense storm
   activity. Hence, under such conditions, the ANTICIPATION
   feature detector suppresses the background anticipation
   interest values.

3. False alarms are rare in the absence of any
   precipitation. Thus, when no precipitation is visible on
   the radar screen, the background anticipation interest
   values may be safely raised, thereby increasing the
   likelihood of detecting an incoming gust front that is
   generated by a more distant storm.
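The first example in the list, blanking every interest image except anticipation near the radar, can be sketched as a simple distance mask. Representing nil as NaN, placing the radar at the center of the array, and the name `blank_near_radar` are assumptions; the 2-km radius and 480-m pixel size come from the text.

```python
import numpy as np

def blank_near_radar(interest, radius_m=2000.0, pixel_m=480.0):
    """Return a copy of an interest image with nil (NaN) near the radar.

    The radar is assumed to sit at the center of the Cartesian array.
    """
    rows, cols = interest.shape
    y, x = np.mgrid[0:rows, 0:cols]
    cy, cx = (rows - 1) / 2.0, (cols - 1) / 2.0
    dist_m = np.hypot(y - cy, x - cx) * pixel_m
    out = interest.astype(float).copy()
    out[dist_m <= radius_m] = np.nan
    return out
```

Applied to every interest image except the anticipation image, this leaves anticipation as the only source of evidence over the radar site, provided the fusion step ignores missing values.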

Combining Evidence

During the feature-detector selection process, a rule of
combination is also chosen to govern the combining of
evidence, an example of data fusion. In principle, the rule
of combination can be as simple as the averaging of pixel
values across all interest images. However, for the set of
ASR-9 WSP feature detectors described earlier, a somewhat
more complicated rule has been used.

The four interest images generated by TL-DZ, TL-SD,
DZ-MOTION, and SD-MOTION are averaged together. During the
process, any missing values are ignored. The resulting
averaged interest image and the anticipation interest image
are combined as a weighted average: the average of the first
four interest images is given a weight of 0.75 while the
anticipation image is given a weight of 0.25. Finally,
elements of the out-of-trip interest image are multiplied by
0.25 and subtracted from the elements of the weighted
average. The resulting image is called the combined interest
image.
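The rule of combination can be written out directly. In the sketch below the weights are taken as 0.75 and 0.25, an assumed reading chosen so that the two weights of the weighted average sum to one; nil values are represented as NaN; and where all four detector images are nil the anticipation value is used alone, which is an interpretation of "missing values are ignored" rather than something the text spells out.

```python
import warnings
import numpy as np

def combine_interest(tl_dz, tl_sd, dz_motion, sd_motion,
                     anticipation, out_of_trip):
    """Combined interest image for the ASR-9 WSP detector set (sketch)."""
    detectors = np.stack([tl_dz, tl_sd, dz_motion, sd_motion])
    with warnings.catch_warnings():
        # nanmean warns where all four inputs are NaN; that case is
        # handled explicitly below.
        warnings.simplefilter("ignore", RuntimeWarning)
        mean4 = np.nanmean(detectors, axis=0)   # missing values ignored
    weighted = np.where(np.isnan(mean4),
                        anticipation,                     # only opinion left
                        0.75 * mean4 + 0.25 * anticipation)
    # Missing out-of-trip values contribute nothing to the subtraction.
    return weighted - 0.25 * np.nan_to_num(out_of_trip, nan=0.0)
```

With this convention, a pixel where every detector except anticipation has been set to nil simply takes the anticipation value, matching the absolute-coasting behavior described for the area over the radar site.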

Figure 6 shows an example ASR-9 WSP DZ image, the outputs of
each feature detector, and the final interest image. In this
case, strong evidence for the two fronts is visible in each
of the component interest images (except, of course, for the
out-of-trip image). Clearly, any one of the feature
detectors acting alone would have been adequate. Now
consider Figure 7, which summarizes the evidence for the
presence of the two gust fronts in a later scan in which
detection has become more difficult as accumulating storm
regions have occluded the fronts. Note that although
different parts of the gust fronts are highlighted in
different interest images, the gust fronts are not
unambiguously visible in any single interest image (except
the anticipation image). In the combined interest image,
however, the gust fronts are much more apparent. This
example illustrates how evidence derived from multiple
feature detectors can be combined so that the various
detectors mutually support and compensate for one another.

In MIGFA, no one feature detector is meant to be a perfect,
or even necessarily a good, discriminator of gust fronts and
background. When used together, however, several weakly
discriminating feature detectors can achieve robust
performance depending on how the detector outputs are
combined.

Extraction 

Algorithms, such as AGFA, that track gust fronts as entities
must identify gust fronts prior to tracking. The algorithms
rely on the assignment of unique labels that permit the
establishment of correspondence across time. Gust front
statistics, such as propagation speed and location, are
computed for the front as a whole. This approach is adequate
for simple cases. Inevitably, however, complex rules are
required to handle the labeling, correspondence, and
tracking for cases in which a single front breaks up into
disjoint fragments or for cases in which multiple fronts
merge or collide. Given the variable nature of gust front
behavior, the construction of a fully comprehensive set of
rules that are correct for all possible circumstances is a
difficult task.

FIGURE 8. The bow-tie functional template used for thin-line smoothing: (a) index kernel and
(b) corresponding scoring functions. (For an explanation of functional templates, see the box,
“Functional Template Correlation.”)

The problem is bypassed in MIGFA by making the
goal  of  extraction  the  identification  of  all  points  (col¬ 
lectively  called  an  event)  that  lie  in  any  gust  front. 
Certainly,  some  chains  of  points  are  spatially  segre¬ 
gated  or  have  different  velocities.  For  purposes  of 
reporting,  such  chains  can  be  inferred  to  belong  to 
separate  gust  fronts  even  though  there  is  no  concerted 
attempt  to  label  or  track  gust  fronts  as  entities.  In¬ 
stead,  individual  points  are  tracked  across  time;  that  a 
point belongs to one gust front or another is irrelevant
to processing. MIGFA predictions are elastic in
that the variable velocities of different points along
the  gust  fronts  are  each  used  to  make  predictions  of 
what  the  gust  front  appearance  will  be  at  some  time  in 
the  future. 

Thin  lines  in  the  combined  interest  image  can  be 
fragmented  for  gust  fronts  that  intersect  with  out-of- 
trip  weather  or  for  fronts  obscured  by  storm  regions. 
To bridge gaps between collinear fragments and to
suppress  random  unaligned  high-interest  values, 
MIGFA  uses  thin-line  smoothing  of  the  combined 
interest  image.  Figure  8  shows  the  bow-tie  functional 
template  used  as  the  basis  for  thin-line  smoothing. 
The template, inspired by the receptive field of the
cooperative cell of the Boundary Contour System
developed by S. Grossberg and E. Mingolla [10], has
a bow-tie shape that weights the influence of the end
regions over that of the center by placing more kernel
elements at the ends. Consequently, the template generates
high output interest scores for an image point
between two collinear high-interest segments, even if
that middle point itself has a low input interest value.
Because  of  the  scoring-function  design,  the  bow-tie 
filter  suppresses  those  collinear  interest  values  which 
are  below  the  level  of  ambiguity  (0.5),  and  amplifies 
those  values  which  are  above  the  level  of  ambiguity. 
With  this  design,  the  boundaries  between  gust  fronts 
and  background  are  sharpened,  resulting  in  cleaner 
shapes for subsequent processing. An example of an
input  image  of  combined  interest  and  an  output 
smoothed  image  are  shown  in  Figure  9. 

A  threshold  of  0.5  is  then  applied  to  the  smoothed 
image  to  create  a  binary  image  of  candidate  fronts. 
The  lengths  of  resulting  elongated  shapes  are  then 
computed,  and  the  elements  of  those  binary  shapes 
which  are  too  short  (<6  km  for  the  ASR-9  WSP)  are 
set  to  0.  The  result  of  this  process  is  shown  in  the 
frame  labeled  “match  >  0.5”  in  Figure  9. 
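
The thresholding and length-filtering step can be sketched in a few lines of Python. This is only an illustration, not the MIGFA implementation: the function name `candidate_fronts`, the `pixel_km` parameter, the use of 8-connected shapes, and the measurement of length as the larger bounding-box extent are all assumptions, since the text does not say how shape length is computed.

```python
from collections import deque


def candidate_fronts(smoothed, pixel_km, threshold=0.5, min_length_km=6.0):
    """Threshold a smoothed interest image and zero out shapes that are too short.

    smoothed: list of rows of interest values in [0, 1].
    pixel_km: assumed size of one pixel in kilometers (hypothetical parameter).
    """
    rows, cols = len(smoothed), len(smoothed[0])
    binary = [[1 if value > threshold else 0 for value in row] for row in smoothed]
    seen = [[False] * cols for _ in range(rows)]

    for r0 in range(rows):
        for c0 in range(cols):
            if not binary[r0][c0] or seen[r0][c0]:
                continue
            # Collect one 8-connected shape with a breadth-first flood fill.
            shape = []
            queue = deque([(r0, c0)])
            seen[r0][c0] = True
            while queue:
                r, c = queue.popleft()
                shape.append((r, c))
                for dr in (-1, 0, 1):
                    for dc in (-1, 0, 1):
                        rr, cc = r + dr, c + dc
                        if (0 <= rr < rows and 0 <= cc < cols
                                and binary[rr][cc] and not seen[rr][cc]):
                            seen[rr][cc] = True
                            queue.append((rr, cc))
            # Assumed length measure: the larger bounding-box extent of the shape.
            shape_rows = [p[0] for p in shape]
            shape_cols = [p[1] for p in shape]
            extent_px = max(max(shape_rows) - min(shape_rows),
                            max(shape_cols) - min(shape_cols)) + 1
            if extent_px * pixel_km < min_length_km:
                for r, c in shape:
                    binary[r][c] = 0
    return binary
```

With 1-km pixels, an 8-pixel line survives the filter while a 2-pixel blob is set to 0.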

The bow-tie functional template also generates a
map of orientations. In the orientation image, each
element indicates the orientation that is associated
with the highest-scoring bow tie rotated at 10° increments
from 0° to 170°. Black pixels correspond with
best matches at 0°; white pixels correspond with best
matches at 170°. An example orientation image is
shown in Figure 9.

The  elongated  binary  shapes  of  the  “match  >  0.5” 
image  can  be  thinned  down  to  a  single-pixel-width 
skeleton  by  using  an  FTC  implementation  of  a  modi¬ 
fied  version  of  S.  Levialdi’s  homotopic  thinning  [11]. 
The  result  of  thinning  is  shown  in  the  frame  labeled 
“marked  thinned”  in  Figure  9. 

The  chains  of  points  resulting  from  thinning  are 
then  extended  along  ridges  of  relatively  high  interest 
by  using  what  is  essentially  a  road-following  algo¬ 
rithm.  At  each  end  point,  the  pixels  immediately 
surrounding  that  point  are  examined  by  looking  out¬ 
ward  from  the  rest  of  the  chain  for  the  maximum- 
interest  pixel  with  an  orientation  (found  in  the  orien¬ 
tation  image)  that  is  within  a  specified  angle  from 
that  of  the  initial  end  point.  When  the  maximum 
interest  score  of  a  new  point  falls  below  0.2  or  when 
no  new  point  has  an  orientation  consistent  with  the 
initial  end  point,  extending  halts.  The  result  of  the 
extending  process  is  shown  in  the  frame  labeled  “ex¬ 
tended”  in  Figure  9. 
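
A minimal sketch of this chain-extension step follows, assuming the interest and orientation images are plain nested lists. The function names, the 30° value used for the "specified angle," and the dot-product test for "looking outward" are illustrative assumptions; only the 0.2 stopping value comes from the text.

```python
def orientation_diff(a_deg, b_deg):
    """Smallest difference between two line orientations (period of 180 degrees)."""
    d = abs(a_deg - b_deg) % 180.0
    return min(d, 180.0 - d)


def extend_chain(chain, interest, orientation, max_angle=30.0, min_interest=0.2):
    """Extend an ordered chain of (row, col) points from its last end point.

    The chain must hold at least two points so that an outward direction exists.
    """
    rows, cols = len(interest), len(interest[0])
    chain = list(chain)
    visited = set(chain)
    end_r, end_c = chain[-1]
    reference = orientation[end_r][end_c]  # orientation of the initial end point

    while True:
        r, c = chain[-1]
        prev_r, prev_c = chain[-2]
        out_r, out_c = r - prev_r, c - prev_c  # points away from the rest of the chain
        best = None
        for dr in (-1, 0, 1):
            for dc in (-1, 0, 1):
                if dr == 0 and dc == 0:
                    continue
                if dr * out_r + dc * out_c <= 0:
                    continue  # neighbor lies back toward the chain
                rr, cc = r + dr, c + dc
                if not (0 <= rr < rows and 0 <= cc < cols) or (rr, cc) in visited:
                    continue
                if orientation_diff(orientation[rr][cc], reference) > max_angle:
                    continue
                if best is None or interest[rr][cc] > interest[best[0]][best[1]]:
                    best = (rr, cc)
        # Halt when no consistent neighbor exists or interest falls below 0.2.
        if best is None or interest[best[0]][best[1]] < min_interest:
            return chain
        chain.append(best)
        visited.add(best)
```

The same routine would be applied to the other end of a chain by passing the chain in reverse order.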

FIGURE 9. Extraction steps. Candidate gust fronts are extracted from a combined interest image. For a description of the
different steps involved, see the subsection “Extraction” in the main text. (Frames: combined interest, smoothed interest,
match > 0.5, orientation, marked thinned, extended, selected chains.)

After the chain-extension process has been completed,
the resulting image may be highly branched
and  it  may  contain  loops.  For  further  refinement  of 
the  image,  chain  segments  are  assigned  scores  based 
on  the  sum  of  the  corresponding  interest  values  found 
in  the  smoothed  interest  image.  In  each  disjoint  net¬ 
work  of  chain  segments,  the  single  most  interesting 
(usually,  but  not  always,  the  longest)  non-looping 
combination  of  chain  segments  is  extracted  as  the 
candidate  gust  front.  Once  the  most  interesting  chain 
has  been  extracted,  the  process  is  repeated  on  the 
remaining  unextracted  chain  segments  to  find  the 
next  most  interesting  combination  of  chain  segments. 
The extraction process is repeated until the most
interesting  remaining  chain  is  below  an  empirically 
determined  interest  threshold.  In  Figure  9,  the  frame 
labeled “selected chains” shows the set of above-threshold
combined chain segments that were extracted
from the “extended” image.

Tracking/Heuristics 

As  stated  earlier,  each  point  in  the  extracted  event 
is  tracked  individually.  The  tracking  of  a  particular 
point  requires  that  the  corresponding  point  in  the 
event  immediately  prior  to  the  current  event  be  found. 
Correspondence  can  be  difficult  to  establish  when 
several  gust  fronts  collide;  in  such  cases,  the  point 
in  the  prior  event  that  is  closest  to  a  point  in  the 
current  event  might  not  necessarily  be  the  correct 
corresponding  point.  Consequently,  the  correspond¬ 
ing  point  is  chosen  to  be  the  closest  point  in  the 
immediately  prior  event  for  which  the  orien¬ 
tation  and  speed  are  consistent  with  the  given 
point  in  the  current  event.  If  no  such  point  in 
the  prior  event  is  found,  then  the  corresponding  point 
is  assumed  to  be  the  closest  point.  Once  cor¬ 
respondence  for  a  point  is  established,  the  point 
is  indexed  by  creating  a  pointer  linking  that  point 
to  the  corresponding  point  in  the  immediately 
prior  event.  If  the  distance  between  the  two  corre¬ 
sponding  points  is  too  large  or  if  the  distance  is 
inconsistent  with  prior  history,  then  the  point  is 
unindexed  (i.e.,  the  link  is  broken).  Through  the 
index  links,  a  point  can  be  tracked  backwards 
in  time  to  its  first  recorded  instance.  The  number 
of  prior  events  through  which  a  point  can  be 
tracked is called the point’s depth. (A depth of
0 means that the point is unindexed.) Once
indexed,  each  point  is  assigned  the  follow¬ 
ing  attributes:  coordinates,  distance  moved,  di¬ 
rection  moved,  depth,  Doppler  value,  interest 
value,  and  propagation  speed. 
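
The correspondence and indexing rules can be illustrated with the following sketch. It is a simplified reading of the text, not the MIGFA code: the `Point` class, the tolerance values, and the interpretation of "consistent speed" as the speed implied by the displacement over one scan interval are assumptions, and the check against prior history is omitted.

```python
import math
from dataclasses import dataclass
from typing import Optional


@dataclass
class Point:
    x_km: float
    y_km: float
    orientation_deg: float            # thin-line orientation, 0 to 170 degrees
    speed_mps: float = 0.0            # propagation speed recorded for the point
    prior: Optional["Point"] = None   # index link into the immediately prior event


def orientation_diff(a_deg, b_deg):
    d = abs(a_deg - b_deg) % 180.0
    return min(d, 180.0 - d)


def distance_km(p, q):
    return math.hypot(p.x_km - q.x_km, p.y_km - q.y_km)


def index_point(p, prior_event, scan_seconds,
                max_orient_diff=30.0, max_speed_diff=5.0, max_distance_km=5.0):
    """Link p to its corresponding point in the prior event, or leave it unindexed."""
    if not prior_event:
        p.prior = None
        return

    def consistent(q):
        implied_speed = distance_km(p, q) * 1000.0 / scan_seconds
        return (orientation_diff(p.orientation_deg, q.orientation_deg) <= max_orient_diff
                and abs(implied_speed - q.speed_mps) <= max_speed_diff)

    # Prefer the closest consistent point; otherwise fall back to the closest point.
    pool = [q for q in prior_event if consistent(q)] or prior_event
    match = min(pool, key=lambda q: distance_km(p, q))
    # A link over too large a distance is broken (the point is unindexed).
    p.prior = match if distance_km(p, match) <= max_distance_km else None


def depth(p):
    """Number of prior events through which the point can be tracked."""
    count = 0
    while p.prior is not None:
        count += 1
        p = p.prior
    return count
```

In this sketch a nearer point with an inconsistent orientation is passed over in favor of a farther, consistent one, which is the behavior the text describes for colliding fronts.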

After  indexing,  each  extracted  chain  of  points  is 
edited: 

1. If the direction a single point moves is opposite
(approximate difference of 180°) from its neighbors,
the direction of the point is reversed.

2.  Single  chains  may  be  divided  into  two  subchains 
if  a  persistent  discontinuity  in  velocity  or  a  per¬ 
sistent  change  in  orientation  is  detected  at  some 
point  along  the  chain. 

3.  Various  parameters  such  as  propagation  speed, 
Doppler  value,  and  direction  of  motion  are 
smoothed  along  the  length  of  each  chain. 

4.  Heuristics  are  applied  that,  when  satisfied, 
unindex  individual  points  in  a  chain.  If  more 
than  half  of  any  chain’s  points  become  unindexed, 
all  points  in  the  chain  are  unindexed. 

The  heuristics  mentioned  in  item  4  above  are  based 
on  knowledge  of  how  false  alarms  can  be  distin¬ 
guished  from  real  gust  fronts.  For  example,  if  the 
direction  a  point  moves  is  inconsistent  with  the  mea¬ 
sured  Doppler  value,  the  point  is  unindexed.  Or,  if 
the point is approaching the radar site and moving in
the  same  direction  and  no  faster  than  the  winds  mea¬ 
sured  by  anemometers  at  the  radar  site  (i.e.,  there  is 
no  convergence),  the  point  is  unindexed. 
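
Two of these editing rules lend themselves to a short sketch: the convergence heuristic and the rule from item 4 that unindexes a whole chain once more than half of its points are unindexed. The function names, the representation of directions as headings in degrees, and the 45° tolerance for "the same direction" are assumptions made for illustration.

```python
def heading_diff(a_deg, b_deg):
    """Smallest difference between two directions of motion (period of 360 degrees)."""
    d = abs(a_deg - b_deg) % 360.0
    return min(d, 360.0 - d)


def lacks_convergence(point_heading, point_speed, wind_heading, wind_speed,
                      approaching, heading_tol=45.0):
    """True when an approaching point moves with, and no faster than, the site wind."""
    same_direction = heading_diff(point_heading, wind_heading) <= heading_tol
    return approaching and same_direction and point_speed <= wind_speed


def apply_half_rule(indexed_flags):
    """Unindex every point in a chain if more than half are already unindexed."""
    unindexed = sum(1 for flag in indexed_flags if not flag)
    if 2 * unindexed > len(indexed_flags):
        return [False] * len(indexed_flags)
    return list(indexed_flags)
```

A chain with exactly half of its points unindexed is left as it is, because the text requires more than half.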

In  the  final  stage  of  tracking,  a  binary  decision  is 
made  for  each  chain  as  to  whether  the  chain  should 
be  declared  a  gust  front.  A  chain’s  summed  interest 
score  and  the  depths  of  its  constituent  points  are  used 
to  make  the  decision.  For  chains  with  high  summed 
interest  scores  (reflecting  a  higher  degree  of  con¬ 
fidence),  points  with  lower  depths  may  be  in¬ 
cluded.  On  the  other  hand,  chains  that  have  low 
summed  interest  scores  are  less  likely  to  be  gust 
fronts  and  are  thus  required  to  accumulate 
higher depths before being included in the announced
gust front detections. The frame labeled
“indexed event” in Figure 5 shows the set of all
extracted points. White pixels represent those
points which have sufficient depths and interest
scores to be reported. Gray pixels represent those
points which will not be reported due to a lack of
confidence.  In  the  frame  labeled  “history,”  the  re¬ 
ported  points  are  shown  in  context  with  previously 
reported  events. 
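
The trade-off between summed interest and depth can be pictured with a very small sketch. The text gives no numerical thresholds, so the values of `strong_interest`, `low_depth`, and `high_depth` below are purely hypothetical, as is the two-level form of the rule.

```python
def reportable_points(summed_interest, depths,
                      strong_interest=20.0, low_depth=1, high_depth=3):
    """Return one flag per point: True if the point is confident enough to report.

    Chains with a high summed interest score may include points of lower depth;
    chains with a low score must accumulate higher depths first.
    """
    required_depth = low_depth if summed_interest >= strong_interest else high_depth
    return [depth >= required_depth for depth in depths]
```

Points flagged True correspond to the white pixels described above, and points flagged False correspond to the gray pixels.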

Prediction 

The  current  extracted  event,  indexed  into  the  prior 
history,  is  used  to  predict  the  future  locations  of  those 
points which have sufficient depths and interest
scores.  Given  the  direction  moved,  the  propagation 
speed,  and  the  current  coordinates  of  a  point,  a  new 
coordinate  is  computed  for  some  specified  time  in  the 
future.  Gaps  can  arise  between  the  projected  future 
coordinates  of  two  adjacent  gust  front  points  when 
the  orientations  and  velocities  of  the  points  are  not 
identical. In such cases, the gaps are filled in. An
example  showing  the  reported  chains  and  their  ex¬ 
pected  locations  after  10  and  20  min  is  shown  in  the 
frame  labeled  “predictions”  in  Figure  5. 
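
The per-point projection can be sketched as follows. The coordinate convention (heading measured counterclockwise from the x axis), the function names, and the use of linear interpolation to fill gaps are assumptions; the text does not specify how gaps are filled.

```python
import math


def predict_position(x_km, y_km, heading_deg, speed_mps, minutes):
    """Project one point forward along its direction of motion."""
    distance_km = speed_mps * minutes * 60.0 / 1000.0
    theta = math.radians(heading_deg)
    return (x_km + distance_km * math.cos(theta),
            y_km + distance_km * math.sin(theta))


def fill_gaps(points, max_gap_km=0.5):
    """Insert evenly spaced points wherever adjacent predictions drift apart."""
    filled = [points[0]]
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        gap = math.hypot(x1 - x0, y1 - y0)
        segments = math.ceil(gap / max_gap_km)
        for i in range(1, segments):
            t = i / segments
            filled.append((x0 + t * (x1 - x0), y0 + t * (y1 - y0)))
        filled.append((x1, y1))
    return filled
```

For example, a point moving at 10 m/sec travels 6 km in 10 min, so `predict_position(0.0, 0.0, 0.0, 10.0, 10)` gives a point 6 km along the x axis.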


Results 

The performance of MIGFA has been scored against
human interpretations of the same input radar data.
Implicit in this statement is the assumption that human
interpretations are 100% accurate. As we will see
later, this assumption is not always correct.

The  human  interpreter  had  access  to  both  Doppler 
and  reflectivity  images  for  an  entire  sequence  of  ASR- 
9  WSP  scans,  which  could  be  viewed  separately  or  in 
sequence  as  a  movie.  For  each  scan,  a  description  of 
“truth” (i.e., the interpretation of the scan by a human)
was stored in a table as a list of coordinates
marking  the  gust  front  end  points  and  an  intermit¬ 
tent  sampling  of  points  in  between.  For  categoriza¬ 
tion  of  results,  the  estimated  maximum  wind  shear  in 
the  zone  of  convergence  was  also  stored.  This  scoring 
exercise  was  intended  to  measure  MIGFA’s  detection 
performance, not the end-to-end gust front detection
capability for the ASR-9 WSP. Consequently, the human
interpreter was restricted to including in the
truth set only those gust fronts which had some visible
signature, however subtle. Other data sources,
such as matching TDWR data and anemometer measurements
of winds over the radar site, were used to
confirm or deny the existence of gust fronts that had
an ambiguous appearance in ASR-9 WSP data. The
interpreter, however, did not use these other data
sources to define gust fronts in the absence of visible
ASR-9 WSP signatures. For cases in which MIGFA
detections in ASR-9 WSP data were scored against a
human interpreter looking at TDWR data, the same
procedures were used to generate the TDWR truth
tables.

FIGURE 10. Human versus MIGFA interpretation of ASR-9 WSP data. The 5-km-wide box denotes a region where a human
interpreter has detected a gust front. The single line represents a detection by MIGFA. Note that the human interpreter
did not include the extreme ends of the front because the ends were nearly radially aligned and had weak reflectivity
values, which are characteristics of out-of-trip weather. However, because the extended thin line moved consistently with the center
of the front and because the variance of Doppler velocity values associated with the thin line was too low to be out-of-trip
weather, MIGFA probably gave the more likely interpretation. The reflectivity is given in dBZ, and the velocity in m/sec.

An automatic scoring procedure, described in detail
by D. Klingle-Wilson et al. [12], compares computed
gust front detections with human-generated
truth  (see  Figure  10).  Briefly  described,  the  scoring 
algorithm  draws  lines  that  connect  the  sequence  of 
coordinates  encoding  the  human-estimated  limits  of 
a  gust  front.  The  lines  are  then  expanded  to  a  5-km- 
wide  region  that  is  called,  in  this  article,  a  truth  box. 
Computed  gust  front  detections  overlapping  with  some 
portion  of  the  truth  box  are  counted  as  successful 
detections  while  those  not  overlapping  are  counted  as 
false  alarms.  A  probability  of  detection  (POD)  is 
computed  by  dividing  the  number  of  successfully 
detected  fronts  by  the  number  of  fronts  identified  by 
the human interpreter. The probability of a false alarm
(PFA) is the number of false alarms divided by the
total number of algorithm-generated detections. (Note:
In this article, POD and PFA values will be expressed
as percentages.) In addition to the hit-or-miss POD
and  PFA  scores,  scoring  is  also  done  in  terms  of  the 
percent  overlap  of  computer-generated  detections  and 
truth  boxes.  The  percent  length  detected  (PLD)  is  the 
number  of  points  in  an  algorithm-generated  detec¬ 
tion  that  fall  within  a  truth  box  divided  by  the  length 
of  that  truth  box  (in  pixels).  The  percent  false  length 
detected  (PFD)  is  the  number  of  points  in  an  algo¬ 
rithm-generated  detection  that  fall  outside  any  truth 
box  divided  by  the  total  number  of  algorithm-gener¬ 
ated  gust  front  points. 
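
The hit-or-miss and false-length scores defined above can be sketched as follows. This is an illustration rather than the scoring procedure of reference [12]: the function names are hypothetical, the truth box is approximated as all points within 2.5 km of the truth line (which rounds the ends of the box), and PLD is left out because it depends on how truth-box length is counted in pixels. The sketch assumes at least one truth line and one detection.

```python
import math


def dist_to_segment(p, a, b):
    """Distance from point p to the line segment from a to b."""
    (px, py), (ax, ay), (bx, by) = p, a, b
    dx, dy = bx - ax, by - ay
    seg_sq = dx * dx + dy * dy
    t = 0.0 if seg_sq == 0 else max(0.0, min(1.0, ((px - ax) * dx + (py - ay) * dy) / seg_sq))
    return math.hypot(px - (ax + t * dx), py - (ay + t * dy))


def in_truth_box(p, truth_line, width_km=5.0):
    """True if p lies within half the box width of any segment of the truth line."""
    return any(dist_to_segment(p, a, b) <= width_km / 2.0
               for a, b in zip(truth_line, truth_line[1:]))


def score(detections, truth_lines):
    """Return POD, PFA, and PFD as percentages.

    detections: list of detections, each a list of (x_km, y_km) points.
    truth_lines: list of truth lines, each a list of (x_km, y_km) vertices.
    """
    def inside_any(p):
        return any(in_truth_box(p, t) for t in truth_lines)

    hits = sum(1 for t in truth_lines
               if any(in_truth_box(p, t) for d in detections for p in d))
    false_alarms = sum(1 for d in detections if not any(inside_any(p) for p in d))
    points = [p for d in detections for p in d]
    outside = sum(1 for p in points if not inside_any(p))
    return {
        "POD": 100.0 * hits / len(truth_lines),
        "PFA": 100.0 * false_alarms / len(detections),
        "PFD": 100.0 * outside / len(points),
    }
```

A detection that overlaps a truth box but runs past its end counts as a hit for POD while its extra points still raise PFD.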

One  improvement  to  this  method  is  the  use  of  a 
MAYBE  category  of  truth.  Often  gust  fronts  or  parts 
of  gust  fronts  are  only  marginally  detectable,  forming 
a gray area in which the human observer is undecided
or  uncertain.  If  an  algorithm  detects  a  weak  gust  front 
associated  with  an  ambiguous  signature,  the  detection 
should  not  count  as  a  false  alarm.  Similarly,  if  the 
algorithm  misses  a  gust  front  that  is  too  weak  to  have 
any  operational  significance,  the  miss  should  not  af¬ 
fect  the  POD  and  PLD  scores.  Radar  image  features 
that  are  categorized  as  MAYBE  are  omitted  from 
scoring. 

Table  1  compares  the  performance  of  MIGFA 
against  the  latest  version  of  AGFA,  which  uses  more 
conventional  methods  of  signal  processing  and  com¬ 
puter  vision.  The  test  set  of  ASR-9  WSP  data  col¬ 
lected  in  Orlando,  Florida,  during  field  testing  in 
1991  contained  nine  different  moderately  strong  gust 
fronts  tracked  through  15  hours  (372  images).  A 
human  interpreter  looking  at  the  same  data  detected 
280  instances  of  the  nine  gust  fronts.  The  first  two 
columns of Table 1 indicate that MIGFA increased
by more than 50% the number of fronts detected by
AGFA, while decreasing the false-alarm rate. Similarly,
the PLD scores (column 3) indicate an improvement
in detection performance. The increase in PFD
(from 12.9% to 33.4%), however, appears to suggest
that MIGFA is not as good as AGFA at discriminating
the extent of individual fronts.

Table 1. AGFA and MIGFA Performance* on ASR-9 WSP Data

                   Gust Fronts                         Gust Front Length
                   Probability of    Probability of a  Percent Length  Percent False Length
                   Detection (POD)** False Alarm (PFA)**  Detected (PLD)  Detected (PFD)
Baseline (AGFA)    56.7              4.6               38.9            12.9
MIGFA              88.1              0.6               86.2            33.4

* As scored against human interpretations of ASR-9 WSP data
** Expressed as a percent

Table 2. AGFA and MIGFA Performance* on ASR-9 WSP Data

                   Gust Fronts                         Gust Front Length
                   Probability of    Probability of a  Percent Length  Percent False Length
                   Detection (POD)** False Alarm (PFA)**  Detected (PLD)  Detected (PFD)
Baseline (AGFA)    42.6              3.2               21.0            4.2
MIGFA              75.1              0.0               58.7            6.4

* As scored against human interpretations of matching TDWR data
** Expressed as a percent

For  a  better  understanding  of  why  MIGFA  was 
extending  fronts  beyond  what  the  human  interpreter 
believed  appropriate,  we  examined  several  cases  in 
which  the  PFD  was  high.  In  most  of  those  cases,  we 
found  the  extra  points  that  MIGFA  included  in  the 
gust  front  detections  were  believable.  For  example, 
Figure  10  shows  a  gust  front  truth  box  that  overlays  a 
MIGFA-generated  detection.  The  human  interpreter 
was  reluctant  to  include  the  extreme  ends  of  the  front 
because  the  ends  were  nearly  radially  aligned  and  had 
weak  reflectivity  values — characteristics  of  out-of-trip 
weather.  However,  because  the  extended  thin  line 
moved  consistently  with  the  center  of  the  front  and 
because  the  variance  of  Doppler  velocity  values  asso¬ 
ciated  with  the  thin  line  was  too  low  to  be  out-of-trip 
weather,  MIGFA  probably  gave  the  more  likely  inter¬ 
pretation  of  the  scene. 

To  substantiate  such  anecdotal  observations,  we 
took  the  gust  fronts  that  MIGFA  and  AGFA  had 
detected  in  ASR-9  WSP  data  and  scored  the  fronts 
against  human  interpretations  of  TDWR  data  that 
had been taken at the same time. Although the resulting
scores (Table 2) support the general trend of the
first three columns of Table 1, the PFD for MIGFA
(6.4%) is now roughly the same as that for AGFA
(4.2%). Because gust fronts are more readily observable
in TDWR imagery, we assume that the TDWR
truth (i.e., the TDWR data as interpreted by a human)
is more accurate than the ASR-9 WSP truth
(i.e., the ASR-9 WSP data as interpreted by a human).
Thus the difference between the PFDs as scored
against ASR-9 WSP and TDWR truths crudely approximates
the percentage of detected gust front points
missed by the human interpreter. For MIGFA, this
difference (33% - 6% = 27%) added to the PLD
scored against the ASR-9 WSP truth (86%) is 113%;
i.e., MIGFA’s performance was 13% better than that
of the human interpreter. For AGFA, the comparable
result is 13% - 4% + 39% = 48%.
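
The arithmetic behind these two figures can be written out as a one-line function. The name `adjusted_pld` is hypothetical; the inputs are the rounded percentages quoted in the text from Tables 1 and 2.

```python
def adjusted_pld(pld_asr9, pfd_asr9, pfd_tdwr):
    """PLD against ASR-9 WSP truth plus the points the human is assumed to have missed.

    The missed fraction is approximated by the drop in PFD when the same
    detections are rescored against TDWR truth.
    """
    return pld_asr9 + (pfd_asr9 - pfd_tdwr)


migfa = adjusted_pld(pld_asr9=86, pfd_asr9=33, pfd_tdwr=6)   # 113
agfa = adjusted_pld(pld_asr9=39, pfd_asr9=13, pfd_tdwr=4)    # 48
```

A value above 100 means that, under this crude approximation, the algorithm marked more true gust front length than the human interpreter did.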

MIGFA  was  installed  at  the  ASR-9  WSP  site  at 
Orlando  International  Airport  in  the  spring  of  1992 
and  was  part  of  a  formal  operational  test  from  8  July 
to  20  September.  During  this  time,  gust  front  detec¬ 
tions  and  predictions  were  relayed  to  air  traffic  con¬ 
trollers  for  their  use  in  planning  air  traffic  operations. 
During  the  early  part  of  the  summer,  several  minor 
problems  and  algorithm  deficiencies  were  identified, 
and  several  fixes  and  enhancements  were  added  dur¬ 
ing  the  middle  of  July.  Careful  interpretation,  or 
“truthing,”  of  the  ASR-9  WSP  data  by  a  human  was 
done  from  1  August  to  20  September. 

As  with  the  off-line  testing  described  earlier,  the 
on-line performance was scored against human interpretations
of the same data. Table 3 shows the performance
statistics for the test period. In general, the
on-line  test  results  substantiate  the  off-line  re¬ 
sults.  Not  surprisingly,  the  POD  (75%)  and  PLD 
(81%)  were  somewhat  lower  than  for  the  off-line 
test results shown in Table 1. Most of this difference
can be explained by two problems.

Table 3. MIGFA Results* on ASR-9 WSP Data

                   Gust Fronts                         Gust Front Length
                   Probability of    Probability of a  Percent Length  Percent False Length
                   Detection (POD)** False Alarm (PFA)**  Detected (PLD)  Detected (PFD)
MIGFA              75.4              1.8               80.8            21.1

* Results are scored against human interpretations of the same ASR-9 WSP data
** Expressed as a percent

Note: The data are for the period 1 August to 20 September 1992 in Orlando, Florida

First,  several  gust  fronts  had  reflectivity  values  at  or 
below  the  sensitivity  limits  of  the  ASR-9.  Of  course, 
those  fronts  with  reflectivity  values  below  the  ASR-9 
limits were missed by both MIGFA and the human
interpreter. But there were a few cases of marginal
contrast in which the human could detect a gust front
while MIGFA had not accumulated enough confidence
to declare an alarm. Note that, unlike MIGFA,
the  human  interpreter  had  the  opportunity  to  exam¬ 
ine  the  sequence  of  radar  images  repeatedly  and  could 
use  information  from  scans  late  in  the  sequence  to 
confirm  or  deny  the  existence  of  the  gust  front  in 
early  scans.  Not  much  can  be  done  to  overcome  the 
sensitivity  limits  of  the  ASR-9.  In  most  (but  not  all) 
cases,  however,  gust  fronts  with  marginal  reflectivity 
levels  were  associated  with  weak  wind  shears.  Because 
these  weak  fronts  had  a  minimal  impact  on  airport 
operations,  a  failure  to  detect  such  fronts  was  not  a 
significant  liability. 

The  second  problem  was  that  several  gust  fronts 
were  missed  due  to  obscuration.  In  these  cases,  storm 
regions  or  out-of-trip  weather  were  extensive  enough 
to  hide  or  fragment  the  thin-line  signatures  so  that 
some  gust  fronts  were  detected  late,  dropped  early,  or 
sometimes  missed  altogether. 

The PFA (1.8%) represents 19 false detections out
of 1080 total detections generated by MIGFA in
more than 14,000 scans processed. The high PFD
(21.1%) is almost entirely the result of MIGFA’s extending
gust fronts beyond the ends delimited by the
human interpreter. With the use of anticipation based
on prior tracking data, MIGFA was able to extend the
detected gust front lengths through areas where the
signatures appeared ambiguous. As was seen with the
off-line testing described earlier, a case-by-case analysis
indicates that most of these extensions were in fact
justified  even  though  they  were  inappropriately  scored 
as  false  lengths.  Rescoring  the  results  against  TDWR 
data  should  improve  the  PFD  score. 

Another  way  to  assess  detection  performance  is  to 
score  only  those  gust  fronts  which  had  an  impact  on 
airport  operations.  From  20  July  to  20  September, 
14 convergent wind shears of greater than 15 kn were
recorded  on  the  anemometer  network  at  the  airport. 
Two  of  the  wind  shears  were  the  result  of  short-lived 
localized  winds  beneath  storm  regions  that  were  di¬ 
rectly  over  the  airport.  The  cause  of  a  third  wind  shear 
could  not  be  determined  for  certain,  but  was  prob¬ 
ably  due  to  a  microburst  that  was  reported  at  the 
south  end  of  the  airport  just  as  the  wind  shear  was 
recorded. In none of these three instances could human
interpreters find evidence of gust fronts in the
ASR-9  data. 

Of the 11 remaining wind shears, which were all
verified later as gust fronts by human interpreters,
MIGFA correctly tracked eight at least up to (but not
always over) the airport. In the eight cases, air traffic
controllers were given initial warnings from 18 to 79
min prior to the arrival of the front. Of the three
missed gust fronts, one was occluded by fast-moving
storm regions that were trailing the front. The second
missed gust front had a very weak fragmented thin-line
signature that was missed both by MIGFA and
the human operators at the radar site who were logging
weather and system activity. The third missed
case was a young gust front that had been generated
by a large microburst only 5 km away from the runways.
Because of its youth, the gust front had not yet
developed a thin-line signature. Human interpreters
who studied the radar scans after the testing was
completed could find no evidence of this particular
gust front in the ASR-9 WSP data, but could see a
small zone of convergence without a corresponding
thin-line signature in the data from TDWR. In summary,
although MIGFA correctly detected and tracked
(up to the airport) 8 out of 11, or 73%, of the gust
fronts that had an impact on airport operations (wind
shear > 15 kn), human operators working at the radar
site were able to log 9 out of 11, or 82%, of the same
gust fronts.

False  gust  front  detections  that  are  reported  to  be 
approaching  an  airport  can  also  adversely  affect  air¬ 
port  operations.  If  a  false  alarm  were  trusted,  inappro¬ 
priate  changes  in  airport  operations  planning  might 
be  made  and  the  resulting  delays  could  be  just  as  bad 
as  when  a  gust  front  is  missed.  During  the  test  period, 
three  incoming  events — covering  a  combined  time  of 
24  min  (12  scans) — were  scored  as  false  alarms.  Only 
one  event  generated  a  false  wind-shear  hazard  alert 
(wind  shear  >  15  kn).  All  three  were  probably  the 
result  of  thin  lines  from  stratiform  rain.  None  of  these 
false  alarms  should  have  influenced  airport  operations 
planning  because  in  each  case  tracking  was  dropped 
when  the  estimated  time  of  arrival  at  the  airport  was 
more  than  40  min. 

Evaluation 

Using  the  same  input  ASR-9  WSP  data,  we  have 
shown  by  direct  comparison  that  MIGFA  provides  a 
substantial  improvement  over  AGFA  in  detection  per¬ 
formance.  We  have  also  provided  indirect  evidence 
suggesting  that,  given  the  same  input  data,  MIGFA 
may  be  nearly  as  good  as  human  interpreters.  How¬ 
ever,  the  absolute  reported  POD  scores  for  MIGFA 
(88%  when  scored  against  ASR-9  truth  and  75% 
when  scored  against  TDWR  truth)  are  potentially 
misleading  and  should  be  regarded  with  caution  be¬ 
cause  the  dataset  used  for  comparison  testing  was 
relatively  small  and  from  only  one  season  at  one  site. 
Thus  the  off-line  test  probably  did  not  contain  a  good 
representative sampling of gust fronts. The test did,
however, provide a reasonable basis for comparing
MIGFA  against  the  older  algorithm. 

The  results  for  the  operational  test  period  should 
be  more  representative  of  MIGFA  performance.  In 
the on-line testing, the POD and PLD scores remained
high (in fact, the scores were only somewhat
lower  than  those  reported  for  the  off-line  testing),  but 
an  apparent  problem  in  the  relatively  high  PFD  score 
(21%)  persisted.  Again,  as  was  shown  in  the  initial 
off-line  testing,  many  of  the  false  detections  were  in 
fact  weak  gust  fronts  or  parts  of  gust  fronts  that  the 
human  interpreter  had  overlooked.  Although  these 
results  have  not  been  rescored  against  TDWR  truth, 
the existence of gust fronts was established for several
cases by the examination of matching TDWR or
anemometer  data. 

An  analysis  of  results  accumulated  during  the  1992 
operational  test  period  has  identified  three  main  classes 
of  failure  modes  for  the  ASR-9  WSP  version  of 
MIGFA.  The  failures  within  the  first  class  are  a  direct 
result of the limited sensitivity of the ASR-9. Some
gust fronts that were visible in TDWR data and that
had  an  impact  on  the  Orlando  airport  with  moderate 
wind  shear  had  reflectivity  returns  below  the  sensitiv¬ 
ity  of  the  ASR-9.  Like  MIGFA,  experienced  human 
observers  using  ASR-9  data  did  not  see  such  gust 
fronts,  although  with  the  benefit  of  hindsight  the 
observers  could  sometimes  detect  above-threshold  frag¬ 
ments  of  what  must  have  been  the  approaching  front. 
In  general,  gust  fronts  with  thin-line  signatures  that 
have  reflectivity  levels  at  or  below  the  sensitivity  limits 
of  the  ASR-9  usually  (but  not  always)  exhibit 
weak  wind  shears,  making  them  operationally  less 
significant. 

The second failure mode was due to a lack of
reliable Doppler estimates of velocity in clear air.
Because of the unreliability of these values, the ASR-9
version of MIGFA had to rely on thin-line signatures
for detecting gust fronts. As discussed earlier, however,
not all thin lines are caused by gust fronts. For
example, elongated low-reflectivity storm echoes associated
with extensive areas of stratiform rain moving
with the ambient wind were a source of false
alarms in the operational testing. Because the
reflectivity levels of light-rain echoes overlap with the
range of reflectivity levels exhibited by gust fronts, the




thin-line feature detectors produced high interest values.
In most of these cases, the thin-line features
associated with the stratiform rain were transient and
did not accumulate enough confidence through time
for the system to declare a gust front. Some false
alarms could be dismissed because of the lack of
implicit convergent wind shears, which were computed
by comparing the radar-measured winds in
incoming candidate gust fronts with the winds measured
by the airport anemometers surrounding the
radar site. In at least one case, however, a false alarm
could not be rejected with this criterion. The winds at
the airport were variable and not representative of the
winds immediately in front of the feature, which was
15 km away from the airport.

The  third  failure  mode  was  caused  by  obscuration. 
During  the  1992  operational  test  period,  several  gust 
fronts  were  either  detected  late,  prematurely  lost,  or 
not  detected  at  all  due  to  obscuration  by  patches  of 
high  reflectivity  that  were  caused  by  storms,  range- 
ambiguous  echoes,  or  ground  clutter.  Even  in  places 
where  the  thin-line  features  were  visible,  such  patches 
of  high  reflectivity  had  sometimes  fragmented  the 
features  into  short  segments.  One  missed  gust  front  is 
known  to  have  had  an  impact  on  the  airport  with  a 
wind  shear  greater  than  15  kn. 

Experience gained from the operational test period
has led to the implementation of a partial solution to
the obscuration problem. The solution uses anticipation
and the system’s ability to detect obscuring weather
patterns. Given a sequence of images, there often
exists some time interval when a significant part of
the gust front is not obscured and tracking can be
initiated. Once sufficient confidence has accumulated,
the system begins to anticipate where the gust front
ought to be in the next scan. In normal operation, the
thin lines of increased interest in the anticipation
interest image are used to boost weak signals that
would otherwise be below threshold for detection.
(During the operational testing, obscuration suppressed
all interest, eliminating any signals for anticipation
to confirm.) In the modified system, when
obscuring weather is found to overlap the anticipated
gust front locations, the anticipation interest values
can be increased to a level at which detection is triggered
regardless of how weak the other evidence is. In
other words, when obscuration is detected, the anticipation
interest image becomes absolute, resulting in
spatially restricted coasting.
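
The difference between normal anticipation and the modified, "absolute" anticipation can be illustrated with a minimal sketch. It is not the MIGFA implementation: the function name, the additive way of combining interest values, and the `threshold` and `boost` values are all hypothetical, chosen only to show the logic described above.

```python
import numpy as np

def detect_with_anticipation(feature_interest, anticipation, obscured,
                             threshold=0.5, boost=0.2):
    """Return a boolean detection mask (illustrative only).

    feature_interest, anticipation: arrays with values in [0, 1].
    obscured: boolean mask marking obscuring weather.
    threshold and boost are hypothetical values, not taken from the article.
    """
    feature_interest = np.asarray(feature_interest, dtype=float)
    anticipation = np.asarray(anticipation, dtype=float)
    obscured = np.asarray(obscured, dtype=bool)

    # Normal operation: anticipation only boosts evidence that is already there.
    combined = np.clip(feature_interest + boost * anticipation, 0.0, 1.0)

    # Modified system: where obscuring weather overlaps an anticipated
    # gust front location, anticipation alone reaches the detection level.
    coasting = obscured & (anticipation > 0.0)
    combined = np.where(coasting, np.maximum(combined, threshold), combined)
    return combined >= threshold
```

In this sketch a weak signal that coincides with an anticipated location is detected, an anticipated location with no signal is detected only when it is obscured, and obscuration without anticipation triggers nothing, which is what keeps the coasting spatially restricted.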

Summary 

The  identifying  signatures  for  gust  fronts — thin  lines 
of  increased  reflectivity,  boundaries  of  converging 
Doppler  values,  and  motion  perpendicular  to  the  thin 
lines  and  convergence  boundaries — are  conceptually 
easy  to  define  and  exploit  as  the  basis  of  detection 
algorithms.  And  yet,  although  several  research  groups 
have worked collectively for nearly 10 years to develop
reliable automatic gust front algorithms, none
of the algorithms has demonstrated performance comparable
to the ideal of human performance.

The problem is that automatic gust front detection,
like other applications in computer vision, is
deceptively more difficult than the task of simply
finding one or more signatures. Human observers
use a variety of perceptual skills that have been notoriously
and surprisingly difficult to implement in computer-vision
systems. For example, humans have a
talent for dealing with uncertain, ambiguous, and
even contradictory evidence. Humans use specific
knowledge of the object being sought and the context
of observation as well as the object’s spatial and temporal
context. Unlike most other computer-vision
and automatic target recognition (ATR) methodologies,
the Experimental Target Recognition System
(XTRS) and the Machine Intelligent Gust Front Algorithm
(MIGFA) do not rely on machine intelligence
only at the higher symbolic levels of processing.
XTRS provides a framework for applying knowledge
at the level of raw data by using specialized techniques
for knowledge-based signal processing and pixel-level
processing of evidence. The fact that MIGFA performance
is competitive with that of human observers is
at least partially due to this use of low-level machine
intelligence.

Acknowledgments 

The authors would like to thank the staff of the FL-3
ASR-9 radar site in Orlando, Florida, who were responsible
for running MIGFA, recording data, and
logging results during the operational test period in
1992. These people include Wes Johnston, Craig




McFarland, Jeff Boisseau, and Cindy Meuse. The
authors would also like to thank Joe Cullen for generating
“truth” from ASR-9 WSP data, against which all
algorithm results have been scored. XTRS, the prototype
object recognition system upon which MIGFA
is based, has evolved over several years of collaboration
with Jacques Verly and Dan Dudgeon.

This  work  was  sponsored  by  the  Federal  Aviation 
Administration. 


REFERENCES

1. J.G. Verly, R.L. Delanoy, and D.E. Dudgeon, “Machine Intelligence
Technology for Automatic Target Recognition,”
Linc. Lab. J. 2, 277 (1989).

2. H. Uyeda and D.S. Zrnic, “Automated Detection of Gust
Fronts,” J. Atmos. Oceanic Tech. 3, 36 (1986).

3. S. Smith, A. Witt, M. Eilts, L. Hermes, D. Klingle-Wilson, S.
Olson, and J.P. Sanford, “Gust Front Detection Algorithm for
the Terminal Doppler Weather Radar Part I: Current Status,”
Proc. 3rd Intl. Conf. on the Aviation Weather System, Anaheim,
CA, Jan. 1989, p. 31.

4. L. Hermes, A. Witt, S. Smith, D. Klingle-Wilson, D. Morris,
G. Stumpf, and M. Eilts, “The Gust Front Detection and
Wind Shift Algorithms for the Terminal Doppler Weather
Radar System,” J. Atmos. Oceanic Tech. (to be published).

5. M. Eilts, S. Olson, G. Stumpf, L. Hermes, A. Abrevaya, J.
Culbert, K. Thomas, K. Hondl, and D. Klingle-Wilson, “An
Improved Gust Front Detection Algorithm for the TDWR,”
Proc. 4th Intl. Conf. on the Aviation Weather System, Paris, June
1991, p. J37.

6. M.W. Merritt, D. Klingle-Wilson, and S.D. Campbell, “Wind
Shear Detection with Pencil-Beam Radars,” Linc. Lab. J. 2,
483 (1989).

7. T.A. Noyes, S.W. Troxel, M.E. Weber, O.J. Newell, and
J.A. Cullen, “The 1990 Airport Surveillance Radar Wind
Shear Processor (ASR-WSP) Operational Test at Orlando
International Airport,” Project Report ATC-178, MIT Lincoln
Laboratory (July 1991), AD-239852 (NTIS only).

8. M.E. Weber, “Airport Surveillance Radar (ASR-9) Wind Shear
Processor: 1991 Test at Orlando, FL,” Project Report ATC-189,
MIT Lincoln Laboratory (June 1992), AD-252246
(NTIS only).

9. R.L. Delanoy, J.G. Verly, and D.E. Dudgeon, “Pixel-Level
Fusion Using Interest Images,” Technical Report 979, MIT
Lincoln Laboratory (26 Apr. 1993).

10. S. Grossberg and E. Mingolla, “Neural Dynamics of Perceptual
Grouping: Textures, Boundaries, and Emergent Segmentations,”
Perception and Psychophysics 38, no. 2, 141 (1985).

11. S. Levialdi, “Parallel Pattern Processing,” IEEE Trans. Syst.
Man Cybern. 1, 292 (1971).

12. D.L. Klingle-Wilson, M.F. Donovan, S.H. Olson, and F.W.
Wilson, “A Comparison of the Performance of Two Gust
Front Detection Algorithms Using a Length-Based Scoring
Technique,” Project Report ATC-185, MIT Lincoln Laboratory
(May 1992).




RICHARD L. DELANOY

is a staff member of the Machine Intelligence Technology
Group. His work spans the fields of computer vision,
machine learning, and construction of object-recognition
systems. From 1980 to 1983, he was a research scientist at
the University of Virginia Department of Psychology,
where he investigated the biochemical correlates of
learning and the effects of stress-related hormones on
electrophysiological models of memory. Before joining
Lincoln Laboratory in 1987, he worked for GE Fanuc
Automation N.A., Inc., as a software engineer developing
numerical and programmable controllers for manufacturing
automation. Dick received a B.A. degree in biology from
Wake Forest University in 1973, a Ph.D. degree in
neuroscience from the University of Florida College of
Medicine in 1979, and an M.S. degree in computer science
from the University of Virginia in 1987. He was a
National Science Foundation Predoctoral Fellow and a
National Institute of Mental Health Postdoctoral Fellow.


SETH W. TROXEL

received a B.S. degree in meteorology from San Jose State
University, California, in 1983 and went on to work as a
meteorologist in the Atmospheric Lidar Group of the
NOAA Wave Propagation Laboratory in Boulder, Colorado.
In 1987, he joined Lincoln Laboratory as a software
engineer and meteorologist with the Weather Sensing
Group. Since coming to Lincoln Laboratory, he has been
involved in the testing and development of
hazardous-weather-detection capabilities for the Airport
Surveillance Radar (ASR-9) and Terminal Doppler Weather
Radar (TDWR). Seth’s primary research interests are in the
areas of remote sensing, software engineering, and design of
algorithms for aviation weather products.


THE LINCOLN LABORATORY JOURNAL, VOLUME 6, NUMBER 1, 1993


Extracting  Target  Features 
from  Angle-Angle  and 
Range-Doppler  Images 

Su  May  Hsu 

■ For diffuse targets, features such as shape, size, and motion can be
determined  from  a  time  series  of  images  from  either  angle-angle  passive 
telescopes  or  range-Doppler  radars.  The  extracted  target  features  can  then  be 
used  for  automated  target  recognition  and  identification. 

An  algorithm  that  uses  scene-analysis  techniques  has  been  developed  to 
perform  the  feature  extraction.  The  algorithm  first  processes  the  images  to 
suppress  noise,  then  applies  a  two-dimensional  slope  operation  for  edge 
detection to determine the target boundaries. Next, Hough transforms are used
on  the  target  edges  to  detect  straight  lines  and  curves,  which  are  subsequently 
refined  with  line  and  curve  fits.  Groups  of  the  fitted  lines  are  then  examined  to 
form  cylinders  and  cones  representing  typical  target  components.  After  these 
shapes  have  been  identified,  the  target  configuration,  size,  location,  and  attitude 
can  be  estimated.  The  target  motion  can  then  be  inferred  from  a  time  series  of 
attitudes  that  have  been  extracted  from  a  sequence  of  images. 


For a target with rough surfaces, electromagnetic
signals are reflected and returned from
scatterers that are distributed over the entire
target surface. The resulting imagery, whether angle-angle
(passive telescopes) or range-Doppler (radar),
will show a diffuse object with surface returns that
become apparent along the sensor line of sight (LOS).
From such images, the target shape, size, and orientation
can be determined from pattern recognition and
identification techniques. And, with a time sequence
of images, the motion of the target can be estimated
from its orientation history.

In the feature-extraction algorithm developed at
Lincoln Laboratory, the images are first processed to
suppress noise and to smooth the image surface. Edge
detection is then performed to determine the target
boundaries, and the detected edge points are integrated
into line segments to form target shapes. Next,
target dimensions and orientations are measured from
the line representation of the target, which is assumed
to be axisymmetric.

In this article, examples of target feature extraction
are demonstrated for both angle-angle and range-Doppler
images. For angle-angle images, target orientation
is obtained from the projected elliptical shape
of circular components and the projected length of a
symmetric body axis. For range-Doppler images, target
dimensions are first determined from a sequence
of range and Doppler extents that have been extracted
from the images. The target aspect-angle history can
then be derived from that same sequence of range and
Doppler extents (as for angle-angle images) by using
the estimated target dimensions.

Image  Scene  Analysis 

The goal of target feature extraction is to obtain shape
information. From such information, the size, orientation,
and position of a target can be estimated.




FIGURE 1. Feature-extraction process. [Flowchart: image processing;
edge detection; linear Hough transform (classify lines: single lines,
parallel lines, intersecting lines); shape projections (classify forms:
cone half-angle, cylinder radius, axis of symmetry); elliptical Hough
transform (determine aspect angle); features (aspect angle, base radius,
and length).]


Shapes  are  composed  of  lines  and  curves  that  are 
collections  of  boundary  edge  points.  Such  edge  points 
can  be  located  with  edge-detection  techniques.  The 
detected  edge  points  can  then  be  associated  with  lines 
and  curves  by  using  predetermined  fitting  constraints. 
For  the  current  application,  cylindrical  and  conical 
shapes  are  considered,  and  Hough  transforms  are  used 
to detect the presence of lines and curves [1].

Figure 1 illustrates the feature-extraction process.
First, image processing is performed to enhance the
edge-detection process. The image processing includes
the application of median filters to remove isolated
noise, the averaging of multiple image frames to enhance
the image signals, and the spatial smoothing of
the image surface. Thresholds are then applied to
remove the image background. Next, a linear Hough
transform is used to detect and collect lines, which are
later examined to form the boundaries of cylinders
and cones. The position and orientation of the axis
of symmetry of each detected shape are then determined.
For angle-angle imagery, an elliptical Hough
transform is generally used to detect and fit the
target-base curve to allow for the subsequent calculation
of the target aspect angle. For range-Doppler imagery,




an elliptical Hough transform is used to determine
the target dimensions, and, hence, to obtain the target
aspect angle.

The three main steps of scene analysis for the
extraction of target features are edge detection, line
identification, and curve extraction. These three steps
are described in the following subsections.

Edge Detection

The boundaries of objects in an image exist at locations
where the image values change abruptly. These
abrupt changes can be detected by a spatial difference
operator [2]. The design of such an operator depends
closely on the quality and complexity of the image,
and on the desired level of feature extraction. The
current application uses spatial gradient operators [3].

The operator center is placed at an image location,
and the result of the convolution operation on the
image values represents the local gradient at that image
location and in the operator's direction. The edge
value at the image location I(x, y) can be expressed as

$$G[I(x, y)] = \sqrt{G_x^{2} + G_y^{2}} ,$$

where $G_x$ and $G_y$ are the image gradients in the horizontal
and vertical directions, respectively. The edge is
in the gradient direction, which can be obtained as

$$\theta = \tan^{-1}\!\left(\frac{G_y}{G_x}\right) ,$$

where θ is measured with respect to the x-axis.


Gradient operators of size 5, as shown in Table 1,
are used. In general, the operator size should be decreased
for coarser image resolutions because, as the
resolution becomes coarser, the object boundaries become
closer to each other.
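
As an illustration of the edge value and direction defined above, the following is a minimal NumPy sketch, not the article's implementation. The kernel names, the helper `correlate5`, and the choice of which operator responds to horizontal change are assumptions made for this example; the kernel entries follow the two 5 x 5 patterns of Table 1.

```python
import numpy as np

# The two 5 x 5 patterns of Table 1. Which pattern is called "x" or "y"
# depends on the image axis convention; here (an assumption) K_COLS responds
# to change from left to right and K_ROWS to change from top to bottom.
RAMP = np.array([-2.0, -1.0, 0.0, 1.0, 2.0])
K_COLS = np.tile(RAMP, (5, 1))   # every row is [-2, -1, 0, 1, 2]
K_ROWS = K_COLS.T                # every column is [-2, -1, 0, 1, 2]

def correlate5(image, kernel):
    """Apply a 5 x 5 operator at every location where it fits entirely."""
    rows, cols = image.shape
    out = np.zeros((rows - 4, cols - 4))
    for i in range(rows - 4):
        for j in range(cols - 4):
            out[i, j] = np.sum(image[i:i + 5, j:j + 5] * kernel)
    return out

def edge_value_and_direction(image):
    """Return the edge value sqrt(Gx^2 + Gy^2) and the gradient direction."""
    g_x = correlate5(image, K_COLS)
    g_y = correlate5(image, K_ROWS)
    return np.hypot(g_x, g_y), np.arctan2(g_y, g_x)
```

The sketch uses `arctan2` rather than a plain arctangent so that the direction stays defined when the horizontal gradient is zero; that is a convenience of this example, not something stated in the text.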

The calculated edge values are small for a smooth
image surface and large for a discontinuous surface.
Thus edge points can be detected by applying a threshold
to the edge values. The threshold can be varied,
depending on the level of detail desired for edge
extraction. The examples presented in this article use
a threshold equal to 10% of the dynamic range of the
edge values, i.e., $G_{\min} + 0.10\,(G_{\max} - G_{\min})$. The
choice of the threshold, however, can be optimized
with respect to the histogram of the edge values.
Because of the spatial extent of the operator, the edge
values obtained after the application of a threshold
generally are thickly populated near or at target
discontinuities. Thus a procedure for non-maxima
suppression can be applied for the further thinning of
these edges [4]. In the procedure, the edge value at a
point is set to zero if the value is not the local maximum
in the direction perpendicular to the edge direction.
Figure 2 shows an example of the detected edge
points superimposed over a gray-scaled image. The
edge points include the outline of the cones and the
silhouette of the cylinder against the background.
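
The thresholding and thinning steps can be sketched as follows. This is a simplified illustration rather than the procedure of reference [4]: the function names are hypothetical, the comparison is made across the edge by stepping one pixel along the gradient direction (rounded to the nearest of the eight neighbors), and ties between equal neighbors are broken so that only one pixel survives.

```python
import numpy as np

def threshold_edges(edge_values, fraction=0.10):
    """Zero every edge value at or below G_min + fraction * (G_max - G_min)."""
    g_min, g_max = edge_values.min(), edge_values.max()
    level = g_min + fraction * (g_max - g_min)
    return np.where(edge_values > level, edge_values, 0.0)

def non_maxima_suppression(edge_values, direction):
    """Keep a point only if it is a local maximum across the edge.

    direction holds the gradient angle (radians) at every pixel.
    Border pixels are left at zero for simplicity.
    """
    rows, cols = edge_values.shape
    thinned = np.zeros_like(edge_values)
    d_col = np.rint(np.cos(direction)).astype(int)
    d_row = np.rint(np.sin(direction)).astype(int)
    for r in range(1, rows - 1):
        for c in range(1, cols - 1):
            value = edge_values[r, c]
            if value == 0.0:
                continue
            ahead = edge_values[r + d_row[r, c], c + d_col[r, c]]
            behind = edge_values[r - d_row[r, c], c - d_col[r, c]]
            # ">=" on one side and ">" on the other keeps one pixel of a plateau.
            if value >= ahead and value > behind:
                thinned[r, c] = value
    return thinned
```

Applied to the thick band of responses that a 5 x 5 operator produces around a step edge, this leaves a single line of edge points.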

Line  Identification 

After  the  edge  points  have  been  detected,  they  must 
be  associated  with  lines  and  curves.  For  simple  closed 
contours, the edge points can be chain coded [5], and


Table 1. 5 x 5 Gradient Operators

      x-Direction              y-Direction

  -2  -2  -2  -2  -2       -2  -1   0   1   2
  -1  -1  -1  -1  -1       -2  -1   0   1   2
   0   0   0   0   0       -2  -1   0   1   2
   1   1   1   1   1       -2  -1   0   1   2
   2   2   2   2   2       -2  -1   0   1   2



FIGURE  2.  Gray-scaled  image  with  detected  edge  points. 

the shape description can be performed with syntactic
pattern grammars [6]. The current application considers
images of complex objects comprising a combination
of shapes, in which broken edges and edges
inside other object boundaries are permitted. Thus a
layered approach for shape formation is required.

Hough transforms can be used to detect the lines
and curves in an image. In Figure 3(a), a line in
two-dimensional (2-D) space is represented by the directed
orthogonal distance h to the origin and the
angle θ that h forms with the x-axis. Any point (x, y)
on the line will satisfy the following equation:

$$h = x \cos\theta + y \sin\theta .$$

The set of lines passing through a given point (x, y)
can be plotted in h-θ space. The result will be a
sinusoidal curve, as shown in Figure 3(b). Note that
the h curves of two points in x-y space will intersect
periodically in h-θ space, and the points of intersection
(spaced π radians apart) will correspond to the
line defined by the two points in x-y space. Thus, if
the h curves of all edge points in an image are plotted
in h-θ space, the points at which the h curves intersect
will correspond to prominent lines in the image.

In practice, the h-θ space is divided into accumulator
cells, and the cell value of a particular cell is
increased by 1 each time a curve passes the cell’s
location. Because the direction of each edge point is
known from the edge-detection process (discussed
earlier), only the portion of the h curve for angles
around that direction needs to be searched. A threshold
is chosen at the half-length (in pixels) of the
shortest line expected to be extracted from the image.
Only those accumulator cells whose cell values are
greater than the threshold are selected for further
investigation. The selected cells represent a set of lines
detected from the image.
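
The accumulator scheme just described can be sketched in a few lines of NumPy. This is an illustrative sketch under stated assumptions, not the article's code: the function names, the cell counts `n_h` and `n_theta`, and the angular search `window` are hypothetical choices, and θ is restricted to the interval from 0 to π so that each line appears once up to the periodicity noted above.

```python
import numpy as np

def hough_accumulate(points, directions, h_max, n_h=41, n_theta=180,
                     window=np.radians(10.0)):
    """Vote in h-theta space using h = x cos(theta) + y sin(theta).

    points: iterable of (x, y) edge points.
    directions: gradient angle (radians) of each edge point.
    Only angles within `window` of the edge direction (modulo pi) are searched.
    """
    thetas = np.linspace(0.0, np.pi, n_theta, endpoint=False)
    h_cells = np.linspace(-h_max, h_max, n_h)
    acc = np.zeros((n_h, n_theta), dtype=int)
    for (x, y), d in zip(points, directions):
        # Angular distance between each theta and the edge direction, modulo pi.
        diff = np.abs((thetas - d + np.pi / 2) % np.pi - np.pi / 2)
        for t in np.nonzero(diff <= window)[0]:
            h = x * np.cos(thetas[t]) + y * np.sin(thetas[t])
            i = int(round((h + h_max) / (2 * h_max) * (n_h - 1)))
            if 0 <= i < n_h:
                acc[i, t] += 1
    return acc, h_cells, thetas

def select_lines(acc, h_cells, thetas, shortest_line_px):
    """Keep cells whose count exceeds half the shortest expected line length."""
    threshold = shortest_line_px / 2.0
    return [(h_cells[i], thetas[t], int(acc[i, t]))
            for i, t in np.argwhere(acc > threshold)]
```

Several neighboring cells usually pass the threshold for one physical line, which is why the text goes on to eliminate lines whose values are close to those of other lines.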

For each detected line, a collection of points that
lie on the line is gathered from the edge points. To
represent the location of a line in the overall image,
the mean of the collected points, i.e., the mean (x, y)
value, is used. For cases in which several line segments
from different image locations happen to be collinear,
the line segments will be represented by just single
line values of h and θ. In such cases, additional lines,
with the same line values h and θ but with different
mean (x, y) values, are added to the line set to represent
the different line segments. The line set is examined
further to eliminate those lines which have mean
(x, y) values and h and θ values that are close to other
lines. The line set is then used as a set of seeds for
growing connected line segments with a relaxation
process, as shown in Figure 4.

During each iteration of the relaxation process,
each edge point in the image is classified in association
with one of the detected lines. The edge point is
then added to the point collection of the associated
line. When a point is not close enough to any of the
lines (the allowable orthogonal distance for point classification
should be less than half the distance of the
closest distinctive feature lines in the image) or when
the edge direction does not agree with the line direction
(the allowable angular deviation should be less



FIGURE 3. Hough transform for straight lines: (a) line defined in x-y coordinates and (b) Hough transform in h-θ space. A
line in two-dimensional (2-D) space can be represented by the directed orthogonal distance h to the origin and the angle θ
that h forms with the x-axis, as shown in part a. For any given point, the set of lines passing through that point can be plotted
in h-θ space. The result will be a sinusoidal curve, as shown in part b. Note that the h curves of two points in x-y space will
intersect periodically in h-θ space. The points of intersection (spaced π radians apart) correspond to the line defined by the
two points in x-y space.


than half the smallest angle formed by intersecting
lines in the image), the point is not classified. At the
end of each iteration, the current classification is compared
with the results of the previous iteration. If any
difference exists, the line set is recalculated from the
current point collection and, with the updated line
set, the point classification is reiterated. The relaxation
process stops when the current classification has
not changed from the previous iteration. After the
relaxation process ceases, the lines are identified from
the edge points, and the position and orientation of
each line are calculated from the point collection with
a least-squares-error fit. This information is then used
for simple shape formation: parallel lines are identified
for cylinders, and intersecting lines are chosen for
cones. Finally, the axis of symmetry is determined for
the position and projected orientation of each chosen
shape.
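
The relaxation loop can be sketched as below. It is a simplified illustration, not the article's implementation: the function names are hypothetical, only the orthogonal-distance test is applied (the edge-direction test is omitted), and the least-squares fit is done as an orthogonal fit with a singular value decomposition, which is one reasonable reading of "least-squares-error fit".

```python
import numpy as np

def fit_line(pts):
    """Orthogonal least-squares fit; returns (h, theta) with
    h = x cos(theta) + y sin(theta)."""
    centroid = pts.mean(axis=0)
    # The direction of least variance of the points is the line normal.
    _, _, vt = np.linalg.svd(pts - centroid)
    normal = vt[-1]
    return float(centroid @ normal), float(np.arctan2(normal[1], normal[0]))

def relax_lines(points, seeds, max_dist, max_iter=50):
    """Classify edge points against seed lines and refit until stable.

    points: (x, y) edge points; seeds: list of (h, theta) from the Hough step.
    Points farther than max_dist from every line get label -1 (unclassified).
    """
    points = np.asarray(points, dtype=float)
    lines = [tuple(s) for s in seeds]
    labels = None
    for _ in range(max_iter):
        hs = np.array([line[0] for line in lines])
        thetas = np.array([line[1] for line in lines])
        dist = np.abs(points[:, :1] * np.cos(thetas)
                      + points[:, 1:] * np.sin(thetas) - hs)
        new_labels = np.where(dist.min(axis=1) <= max_dist,
                              dist.argmin(axis=1), -1)
        if labels is not None and np.array_equal(new_labels, labels):
            break  # classification unchanged: the relaxation has converged
        labels = new_labels
        for k in range(len(lines)):
            members = points[labels == k]
            if len(members) >= 2:
                lines[k] = fit_line(members)
    return lines, labels
```

Note that the fitted (h, θ) pair may come back with both signs flipped relative to the seed; the two describe the same line, so the distance test is unaffected.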

Curve  Extraction 

The circular base of a cylinder or cone generally appears
elliptical in angle-angle imagery because the
aspect angle between the object body axis and the
receiver LOS is usually nonzero. From geometry, the
ratio of the minor and major axes of the projected
ellipse is the cosine of the aspect angle. The aspect
angle, together with the projected body-axis orientation,
can be used to determine the body attitude with
respect to the sensor in 3-D space. In extracting the
elliptical base, it is assumed that the straight boundaries
have already been identified. Thus the ellipse
center will lie on the axis of symmetry of the shape.
For cylinders, the major axis of the base ellipse will be
half the distance between the two parallel edges that
define the shape. For cones, the major axis will be the
distance from one conical edge to the ellipse center.
Thus the ellipse-extraction scheme first determines
the base center along the axis of symmetry of the
shape and then fits an elliptical curve at the base
region to estimate the minor axis.

Figure 5(a) depicts a cone shape with an elliptical
base. The ellipse function with a coordinate rotation
is expressed by

$$\frac{(x - x_0)_r^{2}}{a^{2}} + \frac{(y - y_0)_r^{2}}{b^{2}} = 1 , \qquad (1)$$



FIGURE 4. Relaxation process for line identification. [Flowchart;
recoverable box labels: line detection from Hough transformation;
shape formation.]

where $(x_0, y_0)$ represents the ellipse center on the axis
of symmetry, whose orientation defines the coordinate
rotation (subscript r), and a and b are
the minor and major axes of the ellipse, respectively.
Equation 1 can be rewritten for the minor axis a:

$$a = \frac{\left|(x - x_0)_r\right|}{\sqrt{1 - \dfrac{(y - y_0)_r^{2}}{b^{2}}}} .$$

The above equation indicates that the minor axis of
the ellipse is a function of the location of the ellipse
center $(x_0, y_0)$ along the axis of symmetry, as shown
in Figure 5(b). Thus the Hough transform space
$[a, (x_0, y_0)]$ can be used for ellipse detection in which
values of a are plotted for all edge points in the base
region. The maximally accumulated cell in Hough
transform space will then determine a and the location
of $(x_0, y_0)$.
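
A minimal sketch of this elliptical Hough transform follows. It is an illustration under stated assumptions, not the article's implementation: the edge points are taken to be already expressed in a rotated frame (u, v) with u along the axis of symmetry and v across it, the major axis b is known from the straight boundaries, and the function name, cell count, and candidate grid are hypothetical.

```python
import numpy as np

def elliptical_hough(points_uv, b, u0_candidates, a_max, n_a=50):
    """Accumulate minor-axis values a for candidate centers on the axis.

    points_uv: edge points in the rotated frame (u along the axis of
    symmetry, v across it), so the center is (u0, 0).
    b: major axis (known). Returns (a, u0, aspect angle in degrees).
    """
    acc = np.zeros((n_a, len(u0_candidates)), dtype=int)
    for u, v in points_uv:
        if abs(v) >= b:
            continue  # no real solution for a at or beyond the major axis
        root = np.sqrt(1.0 - (v / b) ** 2)
        for j, u0 in enumerate(u0_candidates):
            a = abs(u - u0) / root  # Equation 1 solved for a
            if a < a_max:
                acc[int(a / a_max * n_a), j] += 1
    i, j = np.unravel_index(acc.argmax(), acc.shape)
    a_est = (i + 0.5) * a_max / n_a  # center of the winning cell
    aspect_deg = np.degrees(np.arccos(min(a_est / b, 1.0)))
    return a_est, u0_candidates[j], aspect_deg
```

The aspect angle in the last step uses the relation stated earlier, that the ratio of the minor to the major axis is the cosine of the aspect angle. The cell width limits the accuracy of a, which is why the text refines the minor-axis value afterward with a curve fit.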

In addition, the edge direction detected at a point on the
ellipse curve should be consistent with the curve tangent
derived by differentiating Equation 1, as follows:

$$\frac{d(y - y_0)_r}{d(x - x_0)_r} = -\,\frac{b^{2}\,(x - x_0)_r}{a^{2}\,(y - y_0)_r} .$$

Thus each edge point is checked with respect to location
and edge orientation before being added to the
collection of points for an extracted curve. The
minor-axis value is then refined from the best curve fit of
the collected points. The results of line and curve
identification are shown and discussed in the following
section.

Feature  Extraction  for  Angle-Angle  Images 

For angle-angle imagery, targets are projected from
3-D space to a plane perpendicular to the sensor
LOS, as shown in Figure 6. Thus spheres are projected
as circles, and the circular bases of cones and
cylinders become ellipses. In the images, the physical
radii of the circular forms of targets are generally
preserved without transformations: the major and
minor axes of the elliptical projection are respectively
the circular radius itself and the radius with a cos φ
factor, φ being the aspect angle (i.e., the angle between
the target body axis and the sensor LOS). The
boundary lines of targets are projected into lines in
images, but their dimensions are generally transformed
with a factor of sin φ. With the use of such projection
relationships between target 3-D space and image
2-D space, the size, shape, and orientation of a target
can be inferred from its image features.
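
These projection relationships can be checked with a short numerical sketch. The function names are hypothetical, and the sin φ factor is applied here to a length measured along the body axis, which is the simplest case of the boundary-line relationship described above.

```python
import math

def project_body(radius, length, aspect_deg):
    """Project a circular base and a body-axis length onto the image plane."""
    phi = math.radians(aspect_deg)
    return {
        "major": radius,                        # preserved by the projection
        "minor": radius * math.cos(phi),        # foreshortened by cos(phi)
        "axis_length": length * math.sin(phi),  # foreshortened by sin(phi)
    }

def recover_aspect_and_length(major, minor, projected_length):
    """Invert the projection: aspect angle from the axis ratio, then length."""
    phi = math.acos(minor / major)
    return math.degrees(phi), projected_length / math.sin(phi)
```

For example, projecting a base of radius 1 and a body length of 3 at a 50° aspect angle and then inverting the result returns the original angle and length, up to floating-point rounding. The inversion becomes ill-conditioned as the aspect angle approaches 0°, where sin φ vanishes.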

Two  examples  of  simulated  angle-angle  images  have 
been  analyzed  to  demonstrate  the  extraction  of  target 
features  from  such  imagery.  First,  a  simulated  image 
of a complex object is used for size and shape estimation.
Then a sequence of cone images is employed for
motion  extraction. 

Size  and  Shape  Estimation 

The  model  used  in  the  example  has  a  cylindrical  main 




FIGURE 5. Elliptical Hough transform: (a) cone shape with elliptical base in x-y coordinates after a coordinate rotation,
and (b) Hough transform in which a, the minor axis of the ellipse, is plotted on the vertical axis and the location of the ellipse
center $(x_0, y_0)$ along the axis of symmetry is plotted on the horizontal axis. For a detailed description of the mathematics
involved, see the main text.


body  with  three  cones  mounted  at  the  forward  end. 
Figure  7(a)  shows  a  simulated  angle-angle  image  of 
the  object  at  a  50°  aspect  angle,  and  Figures  7(b),  (c), 
and  (d)  show  the  results  of  a  scene  analysis  that  was 
performed  with  pixel  sizes  of  2,  8,  and  20  cm,  respec¬ 
tively.  For  each  of  the  scene-analysis  images,  the  edge 
points  have  been  detected  and  lines  identified.  (Note: 
The  different  lines  are  coded  in  different  colors.) 
Ideally,  two  major  parallel  lines  would  be  selected  for 
the  cylindrical  body,  and  three  pairs  of  intersecting 
lines  would  be  chosen  for  the  cones.  Then  an  ellipti¬ 
cal  curve  could  be  fitted  for  the  cylindrical  base  to 
determine  the  aspect  angle.  Because  the  cylindrical 
body  is  large  compared  to  all  three  of  the  pixel  sizes 
used,  the  two  parallel  lines  representing  the  shape  are 
easily  discernible  in  Figures  7(b),  (c),  and  (d).  But  the 
cones,  because  of  their  smaller  size,  appear  distorted 
in  the  images,  particularly  in  Figure  7(d).  Nonethe¬ 
less,  the  cones  are  recognizable  at  pixel  sizes  of  2  and 
8  cm. 

Figure 8 shows the quantitative results. For all pixel
sizes, the estimated cylinder radius (Figure 8[a]) agrees
well with the model. Estimations for the cone radius
(Figure 8[b]) and projected cone angle (Figure 8[c])
are good for pixels smaller than 8 cm, and the aspect


FIGURE  6.  Geometry  of  angle-angle  imagery  of  a  cone. 



FIGURE 7. Size and shape extraction for a simulated object consisting of a cylindrical main body
with three cones mounted at the forward end: (a) angle-angle input image after edge detection,
(b) line classification at a pixel size of 2 cm, (c) line classification at a pixel size of 8 cm, and
(d) line classification at a pixel size of 20 cm.


angle obtained from the elliptical-curve fit (Figure
8[d]) is also in agreement with the model for pixels
smaller than 8 cm.

In general, curve fitting requires finer image quality
than line fitting. Lines can usually be detected
to within one pixel of the true edge. The pixel size,
which represents the sampling size at the image focal
plane, is determined by the type of focal plane and
the angular resolution, the diffraction or resolution
limit of the imaging system, and the signal-to-noise
ratio for the target at the receiver.

Motion  Extraction 

Target  motion  can  be  used  to  aid  the  target-classifica¬ 
tion  process,  as  has  been  demonstrated  in  range-Dop- 
pler  imagery  from  millimeter-wave  (MMW)  and  other 
microwave  radars.  With  a  time  series  of  target  atti¬ 
tudes  extracted  from  such  imagery,  target  motion  can 
be  inferred.  The  observation  generally  is  more  straight¬ 
forward  in  a  time  series  of  high-resolution  images. 




FIGURE  8.  Feature-extraction  performance  at  various  pixel  sizes  for  the  simulated  model  of 
Figure  7:  (a)  calculated  base  radius  of  the  model,  (b)  calculated  base  radius  of  a  cone,  (c)  calculated 
half-cone  angle  of  a  cone,  and  (d)  calculated  aspect  angle  of  the  model.  (Note:  For  comparison,  the 
input  values  used  in  the  simulation  are  indicated  with  straight  lines.) 


For purposes of demonstration, a free-body spin-precession
motion is considered. Figure 9 shows a
coordinate system with a spinning target and the
body axis of the target precessing around the z-axis.
The parameters $\theta_p$, $\phi_p$, and $\rho$ represent the precession
half-cone angle, the precessing azimuthal angle, and
the angle between the LOS and the z-axis, respectively.
Suppose that $\rho$ and $\theta_p$ are constant during an
observation. Then the aspect angle $\alpha$ between the
body axis and the LOS will be a function of $\phi_p$. For the
LOS in the z-x plane,

$$\cos\alpha = \sin\rho\,\sin\theta_p\,\cos\phi_p + \cos\rho\,\cos\theta_p . \tag{2}$$

If $\cos\alpha$ can be estimated for different values of $\phi_p$,
then Equation 2 can be used to solve for the coefficients
$\sin\rho\,\sin\theta_p$ and $\cos\rho\,\cos\theta_p$. The sum and difference
of these coefficients are $\cos(\rho - \theta_p)$ and
$\cos(\rho + \theta_p)$, respectively, and, from these two quantities,
$\rho$ and $\theta_p$ can be obtained.
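
The coefficient-fitting step just described can be sketched as an ordinary least-squares line fit of the cosine of the aspect angle against the cosine of the precessing azimuthal angle. This is a minimal sketch, not the article's implementation: the function names are hypothetical, the data are noise-free synthetic values, and the LOS is assumed to lie outside the precession cone so that the angle between the LOS and the z-axis is at least as large as the precession half-cone angle.

```python
import math

def fit_line(x, y):
    """Ordinary least squares for y = a*x + b; returns (a, b)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    a = sxy / sxx
    return a, my - a * mx

def precession_from_aspect(phi_p_deg, cos_alpha):
    """Fit cos(alpha) = A*cos(phi_p) + B, then recover rho and theta_p (deg)
    from A + B = cos(rho - theta_p) and B - A = cos(rho + theta_p)."""
    c = [math.cos(math.radians(p)) for p in phi_p_deg]
    A, B = fit_line(c, cos_alpha)
    clamp = lambda v: max(-1.0, min(1.0, v))
    diff = math.acos(clamp(A + B))    # rho - theta_p, assumed non-negative
    total = math.acos(clamp(B - A))   # rho + theta_p
    return math.degrees(0.5 * (total + diff)), math.degrees(0.5 * (total - diff))

# Synthetic check with rho = 90 deg and theta_p = 16 deg.
rho, theta = math.radians(90.0), math.radians(16.0)
phi = [20.0 * k for k in range(18)]
cos_alpha = [math.sin(rho) * math.sin(theta) * math.cos(math.radians(p))
             + math.cos(rho) * math.cos(theta) for p in phi]
rho_est, theta_est = precession_from_aspect(phi, cos_alpha)
print(rho_est, theta_est)
```

With measured data the fit would also absorb noise in the aspect-angle estimates, so the recovered angles would only approximate the true values.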

The target projected orientation $\theta_0$ in angle-angle
images is also a function of $\phi_p$:


FIGURE 9. Geometry of a target that is spinning (about the
target body axis) and precessing (about the z-axis) simultaneously.
The parameters $\theta_p$ and $\phi_p$ represent the precession
half-cone angle and the precessing azimuthal angle,
respectively, and $\alpha$ and $\rho$ represent the aspect angle and
mean aspect angle, respectively.




FIGURE 10. Feature extraction for angle-angle images of a simulated spinning triconic body undergoing precessional
motion: (a) input images, (b) edge extraction, (c) fitted data for $\cos\alpha$, and (d) fitted data for $\tan\theta_0$. The rotation rate chosen
was 80°/sec, $\theta_p$ was 16°, the precession rate for $\phi_p$ was 10°/sec, and $\rho$ was 90°. The images were simulated with a 2-cm pixel
size and a 2-sec frame time (i.e., a 2-sec interval between frames), with 19 frames generated over a complete precessional
cycle.



FIGURE  11.  Geometry  of  range-Doppler  radar  imagery  of  a  spinning  cone. 




$$\tan\theta_0 = \frac{\sin\theta_p\,\sin\phi_p}{\cos\rho\,\sin\theta_p\,\cos\phi_p - \sin\rho\,\cos\theta_p} . \tag{3}$$

At broadside viewing ($\rho = 90°$), Equation 3 reduces to
$\tan\theta_0 = -\tan\theta_p\,\sin\phi_p$. Similar to the case of aspect-angle
data, the orientation data can also be fitted with
Equation 3 to determine $\rho$ and $\theta_p$.

As an example, a series of simulated angle-angle
images has been generated for a spinning triconic
body undergoing precessional motion (Figure 10[a]).
The rotation rate chosen was 80°/sec, $\theta_p$ was 16°, the
precession rate for $\phi_p$ was 10°/sec, and $\rho$ was 90°.
(Note: The LOS was assumed to be outside the precession
cone.) The images were simulated with a
2-cm pixel size and a 2-sec frame time (i.e., a 2-sec
interval between frames), with 19 frames generated
over a complete precessional cycle. The triconic boundaries
were identified by computer for each image
frame (Figure 10[b]) so that the aspect angle $\alpha$ and
orientation $\theta_0$ could be determined. Fitting these angles
to Equations 2 and 3 then allows $\rho$ and $\theta_p$ to be
obtained. For the aspect-angle history, Figure 10(c)
shows that the best fit occurs for $\rho = 91.4°$ and
$\theta_p = 15.4°$. For the image-plane body-axis orientation
(given $\rho = 90°$), Figure 10(d) shows that the best fit
occurs for $\theta_p = 16°$.
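
For the broadside case, the orientation fit reduces to a one-parameter problem, which can be sketched as follows. This is an illustrative sketch only: the function name is hypothetical, the orientation data are synthetic and noise-free, and the frame timing (19 frames at 2-sec intervals with a 10°/sec precession rate) is taken from the example above.

```python
import math

def theta_p_from_orientation(phi_p_deg, tan_theta0):
    """Least-squares slope through the origin for the broadside relation
    tan(theta_0) = -tan(theta_p) * sin(phi_p); returns theta_p in degrees."""
    s = [math.sin(math.radians(p)) for p in phi_p_deg]
    slope = sum(si * ti for si, ti in zip(s, tan_theta0)) / sum(si * si for si in s)
    return math.degrees(math.atan(-slope))

frame_time_sec = 2.0        # interval between frames
precession_rate = 10.0      # degrees per second
phi = [precession_rate * frame_time_sec * k for k in range(19)]

# Synthetic orientation history for a 16-deg precession half-cone angle.
tan_theta0 = [-math.tan(math.radians(16.0)) * math.sin(math.radians(p)) for p in phi]
theta_p = theta_p_from_orientation(phi, tan_theta0)
print(theta_p)
```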

Feature  Extraction  for  Range-Doppler  Images 

For range-Doppler radars, targets with angular dynamics
are imaged along the radar LOS in the two
dimensions of range and Doppler. The projection of
$L$, the length of a target measured along its body axis,
onto the LOS is $L\cos\alpha$ in the range dimension,
with $\alpha$ being the aspect angle, as shown in Figure 11.
In the case of a rotating target, the rotational angular
velocity $V$ projected onto the LOS is measured in the
Doppler dimension as $V\sin\alpha$ [7]. With these relationships,
the dimensions of a target can be derived
from its image range and Doppler extents.

Application of this technique is demonstrated in
this section with an example of simulated range-Doppler
images of a diffuse cone. In the example, image
processing and scene analysis are applied, as has been
demonstrated earlier for angle-angle imagery. A target
line model is then used to guide the formation of a
target line representation in the imagery. Target dimensions
and orientations are then determined from
the range and Doppler extents of a sequence of the
images.


Estimation  of  Target  Dimensions 

For the cone-shaped target of Figure 11, the range
and Doppler extents are related to the target aspect
angle $\alpha$ and the rotation rate $f_r$ by

$$R \cdot dr = L\cos\alpha \tag{4}$$

and

$$V \cdot dv = 4\pi D f_r \sin\alpha , \tag{5}$$

where $R$ and $V$ are the range and Doppler extents
(number of image pixels), $dr$ and $dv$ are the range and
Doppler cell sizes, and $L$ and $D$ are the target physical
length and base diameter, respectively. Combining
the two equations to eliminate functions of the aspect
angle gives the following ellipse equation for a rigid
target undergoing steady rotation with constant $L$, $D$,
and $f_r$:


$$\left(\frac{dr}{L}\right)^2 R^2 + \left(\frac{dv}{4\pi D f_r}\right)^2 V^2 = 1 . \tag{6}$$
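
The elimination step can be written out explicitly. The following is a short worked derivation using only the symbols defined for Equations 4 and 5; the semi-axis labels $a_R$ and $a_V$ are introduced here for illustration and do not appear in the original.

```latex
\begin{align*}
\cos\alpha &= \frac{R\,dr}{L}, &
\sin\alpha &= \frac{V\,dv}{4\pi D f_r}, \\[4pt]
\cos^2\alpha + \sin^2\alpha = 1
  \;&\Longrightarrow\;
  \left(\frac{R}{a_R}\right)^2 + \left(\frac{V}{a_V}\right)^2 = 1, &
a_R &= \frac{L}{dr}, \quad a_V = \frac{4\pi D f_r}{dv}.
\end{align*}
```

In pixel units, the semi-axis along the range dimension therefore carries the target length, and the semi-axis along the Doppler dimension carries the product of base diameter and rotation rate.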


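The major and minor axes of the above ellipse can
be estimated for a dataset of $(R, V)$ pairs by the least-squares-error
method. The physical length $L$ and the
Doppler velocity can thus be calculated by

$$L = dr \left[ \frac{\sum R^4 \sum V^4 - \left(\sum R^2 V^2\right)^2}{\sum R^2 \sum V^4 - \sum V^2 \sum R^2 V^2} \right]^{1/2}$$

and

$$4\pi D f_r = dv \left[ \frac{\sum R^4 \sum V^4 - \left(\sum R^2 V^2\right)^2}{\sum R^4 \sum V^2 - \sum R^2 \sum R^2 V^2} \right]^{1/2} ,$$

where each sum runs over the $(R, V)$ pairs in the dataset.
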
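
A least-squares ellipse fit of this kind can be sketched in a few lines. The sketch below is an assumption about one reasonable formulation (an algebraic fit of A·R² + B·V² = 1 solved through its normal equations), not a transcription of the article's code; the function name is hypothetical, and the synthetic inputs reuse the cone dimensions and cell sizes quoted later for the simulated example.

```python
import math

def fit_range_doppler_ellipse(R, V, dr, dv):
    """Least-squares fit of A*R^2 + B*V^2 = 1 over (R, V) pairs in pixels.
    Returns (L, doppler_scale), where L = dr/sqrt(A) and
    doppler_scale = dv/sqrt(B), which equals 4*pi*D*f_r in Equation 5."""
    x = [r * r for r in R]
    y = [v * v for v in V]
    sxx = sum(xi * xi for xi in x)
    syy = sum(yi * yi for yi in y)
    sxy = sum(xi * yi for xi, yi in zip(x, y))
    sx, sy = sum(x), sum(y)
    det = sxx * syy - sxy * sxy          # zero if all pairs share one aspect angle
    A = (sx * syy - sy * sxy) / det
    B = (sy * sxx - sx * sxy) / det
    return dr / math.sqrt(A), dv / math.sqrt(B)

# Noise-free synthetic extents for a 150-cm cone with a 39.4-cm base diameter.
L_true, D_true, f_r = 150.0, 39.4, 1.0 / 4.5
dr, dv = 5.0, 2.93
scale_true = 4.0 * math.pi * D_true * f_r
angles = [math.radians(38.0 + k) for k in range(10)]
R = [L_true * math.cos(a) / dr for a in angles]
V = [scale_true * math.sin(a) / dv for a in angles]

L_est, scale_est = fit_range_doppler_ellipse(R, V, dr, dv)
D_est = scale_est / (4.0 * math.pi * f_r)   # requires a known rotation rate
print(L_est, D_est)
```

The determinant shrinks as the spread of aspect angles narrows, which is one way to see why the measurement errors must be small compared with the variation caused by the changing aspect angle.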

For  cases  in  which  the  rotation  frequency  fr  is 
known,  the  base  diameter  D  can  also  be  determined 
from  the  Doppler  velocity.  Theoretically,  the  above 
derivation  can  be  performed  for  any  dataset  with 
more  than  two  pairs  of  (R,  V)  measurements,  pro¬ 
vided  that  the  data  noise  or  measurement  errors  are 




much less than the true data difference corresponding
to different aspect angles. For a large dataset, the
measurements of range and Doppler extents can be
median filtered separately to remove noise (while still
retaining the temporal trend resulting from changing
aspect angles).
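
The median-filtering step can be sketched with a simple running median. The window length and the sample values below are assumptions chosen for illustration; the article does not state the filter length that was used.

```python
from statistics import median

def median_filter(values, window=5):
    """Running median with an odd window; the window shrinks at the ends."""
    half = window // 2
    return [median(values[max(0, i - half): i + half + 1])
            for i in range(len(values))]

# Hypothetical range extents (pixels) with two outliers on a slow trend.
extents = [22, 22, 21, 35, 21, 20, 20, 5, 19, 19]
smoothed = median_filter(extents, window=3)
print(smoothed)
```

The outliers are replaced by neighboring values while the slow downward trend, which carries the aspect-angle information, is retained. The range and Doppler series would each be filtered this way separately.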

With both $L$ and $D f_r$ derived from a given dataset,
the target aspect-angle history can be obtained from
either Equation 4 or 5, or from the ratio of the two
equations. It should not matter which equation is
used to obtain the aspect angle because both of the
equations were used to derive Equation 6, which was
used to estimate both $L$ and $D f_r$. More precise results,
however, may be obtained by using the equation corresponding
to better cell resolution.
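
The three routes to the aspect angle can be compared in a short sketch. The function names are hypothetical, and `doppler_scale` stands for the fitted Doppler-velocity coefficient of Equation 5 (the product of base diameter, rotation rate, and the constant factor); the numbers are synthetic.

```python
import math

def aspect_from_range(R, dr, L):
    """Equation 4: cos(alpha) = R*dr / L."""
    return math.degrees(math.acos(min(1.0, R * dr / L)))

def aspect_from_doppler(V, dv, doppler_scale):
    """Equation 5: sin(alpha) = V*dv / doppler_scale."""
    return math.degrees(math.asin(min(1.0, V * dv / doppler_scale)))

def aspect_from_ratio(R, V, dr, dv, L, doppler_scale):
    """Ratio of Equations 5 and 4: tan(alpha) = (V*dv/doppler_scale) / (R*dr/L)."""
    return math.degrees(math.atan2(V * dv / doppler_scale, R * dr / L))

# Synthetic extents for a 42-deg aspect angle.
L, doppler_scale, dr, dv = 150.0, 110.0, 5.0, 2.93
alpha = math.radians(42.0)
R = L * math.cos(alpha) / dr
V = doppler_scale * math.sin(alpha) / dv

a_range = aspect_from_range(R, dr, L)
a_doppler = aspect_from_doppler(V, dv, doppler_scale)
a_ratio = aspect_from_ratio(R, V, dr, dv, L, doppler_scale)
print(a_range, a_doppler, a_ratio)
```

For noise-free data the three estimates coincide; with quantized extents, the dimension with the finer cell resolution would be expected to give the smoother history.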

Motion  Extraction  for  a  Simulated  Cone 

For purposes of demonstration, range-Doppler images
have been simulated for a cone (length of 150 cm
and a base radius of 19.7 cm) having a diffuse target
surface. A total of 280 images was generated over
28 sec with the cone undergoing spin (spin period of
4.5 sec) and precession [8]. Each image was 53 × 53
pixels, with a range cell representing 5 cm and a
Doppler cell representing 2.93 cm/sec. Figure 12 (top)
shows one of the range-Doppler image frames.

Image  processing,  edge  detection,  and  line  fitting 
were  then  performed  on  the  images.  For  the  simple 
known  target  shapes  in  the  current  application,  the 
line  fitting  was  simplified  with  a  piecewise  linear  fit¬ 
ting  of  the  target  boundaries:  the  edges  at  the  base 
were  collected  and  fitted  linearly,  and  the  cone  was 
defined  by  two  lines  fitting  the  cone  edges  and  the 
base  line.  Range  and  Doppler  extents  of  the  target 
could  then  be  estimated  from  the  target  line  model. 
Figure  12  (bottom)  shows  a  frame  of  the  image  with 
the  edge  points  detected  and  the  lines  fitted.  For  such 
figures,  the  bisection  line  of  the  cone  is  the  target- 
body  centerline,  the  distance  from  the  nose  to  the 
base  center  in  range  is  the  target  range  extent,  and  the 
Doppler  spread  between  the  intersections  of  the  base 
line  and  the  cone’s  left  and  right  sides  is  the  Doppler 
extent. 
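
Reading the extents off the fitted line model can be sketched as below. The representation is an assumption made for illustration: each point is a (range pixel, Doppler pixel) pair, the nose is the intersection of the two cone-edge lines, and the two base corners are the intersections of the base line with the cone's sides. The function name and coordinates are hypothetical.

```python
def extents_from_line_model(nose, base_left, base_right):
    """Return (range_extent, doppler_extent) in pixels from three vertices
    of the fitted cone outline, each given as (range_px, doppler_px)."""
    base_center_range = 0.5 * (base_left[0] + base_right[0])
    range_extent = abs(base_center_range - nose[0])      # nose to base center, in range
    doppler_extent = abs(base_right[1] - base_left[1])   # spread across the base
    return range_extent, doppler_extent

# Hypothetical vertices from one frame.
R, V = extents_from_line_model((10.0, 26.0), (31.0, 14.0), (33.0, 39.0))
print(R, V)  # 22.0 25.0
```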

Figure  13  contains  plots  of  the  extracted  range  and 
Doppler  extents  from  the  image  sequence.  Note  that, 
in  spite  of  the  considerable  amount  of  local  noise 



FIGURE  13.  The  extracted  (a)  range  and  (b)  Doppler  ex¬ 
tents  from  a  sequence  of  simulated  images  (see  Figure 
12). 

present in the imagery, the range and Doppler histories
show functions of $\cos\alpha$ and $\sin\alpha$, respectively, for
the aspect angle $\alpha$, which changes because of the
precession of the target. Target dimensions can be
obtained by the elliptical fitting of the range and
Doppler data, as described earlier in the subsection
“Estimation of Target Dimensions.” With such techniques,
a value of 153.5 cm was calculated for the
target length, and 17.3 cm for the base radius. Note
that the derived target dimensions agree well with the



FIGURE 14. Aspect-angle history derived from the Doppler
extents (see Figure 13[b]).

corresponding  input  model  dimensions  of  150  cm 
and  19.7  cm.  Both  image-resolution-related  noise  and 
target-surface-scattering  noise  could  have  contributed 
to  the  differences  in  the  derived  and  input  target 
dimensions.  The  quantitative  impacts  of  such  noise 
are  currently  under  evaluation. 

Figure  14  shows  the  aspect-angle  history  that  was 
derived  from  the  Doppler  extents.  A  running  averag¬ 
ing  of  4.5  sec  was  applied  to  smooth  the  aspect-angle 
curve,  which  changes  from  38°  to  47°  with  a  period 
of  about  28  sec.  With  the  assumption  that  the  radar 
LOS  was  outside  the  target  precession  cone,  the  target 
precession  half-cone  angle  can  be  calculated  as  4.5°, 
which  is  comparable  to  the  input  angle  of  5°. 
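
The half-cone angle quoted above follows from the extremes of the aspect-angle history. With the LOS outside the precession cone, Equation 2 implies that the aspect angle swings between the mean aspect angle minus the half-cone angle and the mean aspect angle plus the half-cone angle, so half the swing gives the half-cone angle. A minimal sketch, with a hypothetical function name:

```python
def precession_from_aspect_range(alpha_min_deg, alpha_max_deg):
    """Half-cone angle and mean aspect angle (degrees) from the extremes of
    the aspect-angle history, assuming the LOS is outside the precession cone."""
    half_cone = 0.5 * (alpha_max_deg - alpha_min_deg)
    mean_aspect = 0.5 * (alpha_max_deg + alpha_min_deg)
    return half_cone, mean_aspect

half_cone, mean_aspect = precession_from_aspect_range(38.0, 47.0)
print(half_cone, mean_aspect)  # 4.5 42.5
```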

Summary 

The  extraction  of  target  features — shape,  size,  orien¬ 
tation,  and  dynamics — from  both  angle-angle  and 
range-Doppler  images  has  been  demonstrated.  The 
target  shape,  size  and  orientation  can  be  determined 
from  single-frame  angle-angle  images,  and  the  target 
angular  dynamics  can  be  estimated  from  a  time  series 
of  such  images.  For  the  case  of  range-Doppler  imag¬ 
ery,  multiple  frames  of  images  are  required  to  obtain 
multiple  measurements  of  range  and  Doppler  extents 
to  enable  the  derivation  of  the  target  dimensions.  A 
time  series  of  range-Doppler  images  can  also  be  used 
to  estimate  the  target  angular  dynamics. 


For simple known targets, the feature-extraction
process can be simplified greatly by obtaining the
target image description with a piecewise linear approximation
of the target boundaries. For cases in
which some size/shape parameters such as cone half-angle,
length, and/or base diameter are available, the
target size and aspect angle can be estimated from
data extracted for each frame of range-Doppler images.
Convergence of such estimations over multiple
frames of images can then be used to evaluate the
algorithm performance and to verify measurements
for the known target.

The  algorithm  is  undergoing  extensive  testing  with 
simulated  and  field  data,  and  both  the  precision  and 
efficiency  of  the  algorithm  are  expected  to  improve 
over  time.  Preparation  for  real-time  implementation 
of  the  range-Doppler  algorithm  is  currently  in  progress. 

Acknowledgments 

The  author  would  like  to  express  her  sincere  apprecia¬ 
tion  to  Kent  Edwards  for  his  suggestions,  guidance, 
and  support  for  this  article.  Special  thanks  are  due  to 
Marc  Bernstein  and  Thierry  Copie  for  their  interest, 
encouragement,  and  continuous  supply  of  simulated 
and  field  data.  Without  the  simulated  data,  the  assess¬ 
ments  of  algorithm  performance  would  not  have  been 
possible.  Without  the  field  data,  the  algorithm  could 
not  have  been  made  realistic  and  practical.  The  au¬ 
thor  also  thanks  Marianne  Pietrzyk  for  her  assistance 
in  the  development  of  the  algorithm  for  the  extrac¬ 
tion  of  target  features  from  angle-angle  imagery. 

This  work  was  sponsored  by  the  U.S.  Army  Strate¬ 
gic  and  Space  Defense  Command. 




REFERENCES 


1. D.H. Ballard, “Generalizing the Hough Transform to Detect
Arbitrary Shapes,” Pattern Recognition 13, 111 (1981).

2. R.C. Gonzalez and P. Wintz, “Digital Image Processing,” 2nd
ed. (Addison-Wesley Publishing Co., Reading, MA, 1987).

3. P.R. Beaudet, “Rotationally Invariant Image Operators,” Proc.
Intl. Conf. on Pattern Recognition (Kyoto, 7-10 Nov. 1978),
p. 579.

4. L.S. Davis, “Two-Dimensional Shape Representation,” Handbook
of Pattern Recognition and Image Processing, eds. T.Y.
Young and K.S. Fu (Academic Press, Inc., Orlando, FL, 1986).

5. H. Freeman, “On the Encoding of Arbitrary Geometric Configurations,”
IEEE Trans. Elec. Computers EC-10, 260 (1961).

6. K.S. Fu, ed., Syntactic Pattern Recognition, Applications (Springer-Verlag,
Berlin, 1977).

7. A.M. Aull, R.A. Gabel, and T.J. Goblick, “Real-Time Radar
Image Understanding: A Machine-Intelligence Approach,” Linc.
Lab. J. 5, 195 (1992).

8. T.B. Copie, private communication (Oct. 1992).


SU MAY HSU

received the following degrees
in electrical engineering: a B.S.
from the National Taiwan
University and a Ph.D. from
Purdue University, where her
research focus was in syntactic
pattern recognition. She later
joined the Corporate Research
and Development Center of
General Electric Co. in
Schenectady, New York, where
she became involved in image
processing and scene analysis in
application to nondestructive
part inspection for quality
assurance. In 1981 she joined
Lincoln Laboratory, where she
is currently a staff member in
the Signature Studies and
Analysis Group, and her focus
of research has been in scene
analysis and signal processing.




